Description

[0001] The present invention relates to synthetic gag and gagpol genes optimized for high level expression via codon optimization and the uses thereof for the efficient generation of vector particles. The invention further relates to the generation of packaging cells and vaccines based on the synthetic gag and gagpol genes.

[0002] Retroviral based vectors for DNA vaccination or gene therapy require the ability to express structural genes and enzymatic functions in high yields in order to obtain maximal efficiency and yet, at the same time, with the lowest possible risk and side effects for the recipient. Efficient production of structural or enzymatic proteins from complex retroviruses like HTLV-1 and -2, Lentiviruses and Spumaviruses is limited by their complex regulatory mechanisms of transcription, nuclear RNA translocation and expression. For example, the expression of SIV and HIV-1 late gene products is highly regulated on both the transcriptional and the post-transcriptional level and therefore depends on the presence and function of early phase proteins like Tat and Rev. Whereas the anti-repressor function of Tat can be easily avoided by using heterologous transcriptional control via viral and/or cellular promotor/enhancer units, the expression of late genes coding for structural or enzymatic functions (gag, env, pol) depends on the presence of the transactive Rev protein, known to promote the export of un- and partially spliced RNAs from the nucleus to the cytoplasm via the interaction with its cognate RNA recognition site, the Rev responsive-element (RRE), a complex 351 nucleotides (nt) long RNA stem-loop structure located within the env open-reading-frame. Although Rev/RRE action is well accepted as a necessary prerequisite for HIV-1 late gene expression, the critical contribution of different cis-acting elements within the un- and singly spliced transcripts to Rev dependency and timely regulated expression is still a matter of controversy. Accordingly, Rev dependent and timely regulated export of late singly spliced or unspliced lentiviral mRNAs has been explained in most reports either by inefficient splice site formation or by inhibitory sequences located within the coding region (referred to as INS elements).

[0003] Whereas the exact nature and function of so called INS elements is presently not completely clear, Rev dependent nuclear export and expression of the un- and singly spliced lentiviral mRNAs, at least when expressed from heterologous promotors, seems to engage and require the major splice donor site located in the untranslated region 5' of the Gag coding sequence (5'-UTR). This 5'-UTR also contains part of the RNA packaging signal (Ψ), which - in contrast to the Ψ-sites of traditional C and D-type retroviruses - also extends into the coding area of lentiviral genomes including the Gag and Pol reading frame.

[0004] Although lentiviral vectors derived from complex retroviruses like HTLV-1 and -2, Lentiviruses and Spumaviruses offer great promise in the field of DNA vaccination and/or gene therapy, concerns regarding safety in humans still exist, due to the striking pathogenic potential of HTLVs and lentiviruses like HIV-1 or HIV-2. Moreover, due to the complex viral regulatory mechanisms underlying Gag and GagPol expression, efficient expression is either limited by or depends heavily on the presence of cis-acting elements (UTR, RRE) and transactive proteins (Rev, Tat). Tat-independent transcription can be easily achieved by employing a constitutive mammalian promotor like the Cytomegalovirus immediate early promotor/enhancer unit. Rev independent expression of late HIV-1 genes such as those coding for structural or enzymatic proteins would be highly desirable in order to improve efficiency and safety of lentiviral based gene therapy vectors and DNA vaccines as well as for antigen production in a mammalian expression system. Generally, two systems can be applied: (a) The Mason-Pfizer monkey virus (MPMV) constitutive transport element (CTE), a cis-acting RNA element located within the 3' untranslated region (UTR) of the viral genome, can substitute for the HIV-1 Rev/RRE regulatory system. In addition, several other cis-acting RNA elements from a variety of viruses that also promote nuclear export of intron containing RNAs were discovered (e.g. Rous sarcoma virus, Simian retrovirus type D, Avian leukemia virus, HBV). Accordingly, the addition of the MPMV-CTE element or functionally analogous cis-acting RNA elements overcomes the effect of the so called INS elements, which have been proposed to account for the nuclear sequestration of late lentiviral RNAs in the absence of Rev. (b) Low or no expression of late HIV-1 gene products in the absence of Rev has been partially overcome by clustered single point mutations within the wobble positions of selected codons within coding DNA sequences (Schwartz et al. 1992a, J. Virol. 66:7176-7182; Schneider et al. 1997, J. Virol. 71:4892-4903; Haas et al. 1996, Curr. Biol. 6:315-324). This approach aimed to destroy the so called INS elements in order to render the expression of the resulting Gag gene independent of Rev. However, the presence, nature and function of INS elements are still controversial, indicating that a complete knockout of poorly characterized INS elements on the basis of single point mutations may be difficult to achieve. This view is supported by the observation of Qiu and colleagues (Qiu et al. 1999, J. Virol. 73(11):9145-52), showing that the addition of a CTE element to the (only partially) altered Gag gene leads to a further increase of Gag expression. This phenomenon does not hold true for a completely Rev-independent Gag expression construct.

[0005] A preferred application of HTLV-1 and -2 as well as Lentivirus derived Gag and GagPol based expression vectors is DNA vaccination and the production of antigens, preferably virus-like particles, in higher eucaryotic expression systems such as mammalian and insect cells. Cell mediated immune responses to Gag and Pol products of HIV-1 were shown to protect from disease or at least to contribute to an efficient control of virus replication. Accordingly, in persons with chronic infection, HIV-1-specific proliferative responses to p24 as well as the number of Gag specific CTL precursors were inversely related to the determined viral loads. Moreover, CTL responses directed towards highly conserved epitopes within Gag or Pol seem to critically contribute to the control of virus replication in long-term non-progressing HIV infected individuals.

[0006] Different formats have been devised in the past for presenting Gag and Pol containing immunogens to the immune system, recombinant virus-like particles (VLPs) and Gag/Pol containing DNA constructs being amongst them. VLPs were usually produced in a baculovirus driven insect cell expression system, which allows the complex regulation of Gag and Pol expression seen in the natural host or mammalian cells to be bypassed. Such VLPs turned out to be highly immunogenic in rodents as well as in nonhuman primates. However, to comply with the regulations of Good Manufacturing Practice (GMP) and in order to achieve proper posttranslational modification of Gag/Pol polypeptides (including coexpressed proteins that might be packaged in or presented by VLPs), expression of GagPol and derivatives thereof in mammalian cell lines would be highly desirable.

[0007] Vaccination by direct injection of DNA is a novel and promising method for delivering specific antigens for immunization. Plasmid DNA immunization has potential advantages compared to traditional protein vaccination due to the strong T helper 1 (Th1) and CTL responses induced, the prolonged antigen expression and the long lived effector activity. In animal models, DNA vaccination has been shown to induce protective immunity against a variety of viral and parasitic pathogens. In most cases, strong, yet highly variable, antibody and cytotoxic T cell responses were associated with control of infection. Plasmid DNAs expressing genes derived from HIV-1 have recently been shown to induce humoral and cellular immune responses in rodents, in nonhuman primates and in phase I studies in humans. Although these constructs were able to induce an immune response, both circulating antibody titers and HIV-1 specific CTL levels were transient and low.

[0008] However, the development of DNA constructs and infectious, although not necessarily replication-competent, vectors expressing Gag or GagPol or derivatives thereof for antigen production in eucaryotic cells and for vaccination purposes faces several limitations regarding both safety and efficiency. Gag and GagPol expression using wildtype genes in the Rev-dependent situation is limited (a) by the complex viral regulatory mechanisms, which involve several cis-acting elements (RRE, UTR) and trans-acting proteins (Rev, Tat). In the absence of UTR, RRE, or Rev, no or only minute amounts of Gag or GagPol protein will be produced. Gag or GagPol expression therefore needs simultaneous expression of the Rev protein either from a bicistronic construct or from a separate plasmid. Both strategies limit the efficiency of generating cell lines in vitro or reduce the efficacy of DNA vaccine constructs in vivo, either due to an increased plasmid size or due to the necessity to transfect/transduce one single cell with both plasmids at the same time; and (b) by the presence of the Rev protein itself, which acts as an RNA shuttle between nucleus and cytoplasm and therefore harbors an intrinsic risk. Moreover, when applied as a therapeutic vaccination in chronically HIV infected individuals, Rev could contribute towards reactivating latent viruses, thereby enhancing the infection, virus replication and disease.
[0009] Moreover, Gag and GagPol expression - irrespective of whether it is accomplished by the Rev/RRE system or mediated independently of Rev/RRE by addition of a constitutive transport element (CTE) or analogous cis-acting sites - is limited by the presence of UTR, RRE and several stretches in the GagPol coding area, which are known to comprise the HIV-1 packaging signal Ψ or other RNA elements responsible for packaging the genomic RNA into viral particles. Vectors comprising such RNA elements could be packaged into and distributed by endogenous retroviral particles and contribute towards the mobilization of endogenous retroviruses. Additionally, Gag or GagPol derived VLPs generated either in cell culture or in vivo following DNA vaccination would self package their own RNA, which precludes any safe application of such DNA vaccine constructs for vaccination purposes in humans.

[0010] Moreover, any kind of Gag and GagPol expression - irrespective of being achieved (i) by the Rev/RRE system, (ii) by addition of CTE or analogous sites or (iii) after introduction of single point mutations in the Gag or GagPol gene - is limited by the wildtype sequence itself in the case of a Gag or GagPol DNA vaccine. Recombination events could occur between the vaccine construct itself (UTR, RRE, wildtype ORF) and endogenous retroviral sequences or, when administered as a therapeutic vaccine, between the nucleic acids of the vaccine construct and the genetic information of the virus circulating within the patient. Both could lead to the reconstitution of potentially infectious or chimeric particles carrying unknown risks.

[0011] Another preferred application of lentiviral Gag and GagPol based vectors is gene transfer into different types of dividing, resting or postmitotic cells. In contrast to standard retroviral gene transfer as mediated e.g. by Moloney Murine Leukemia Virus (MoMuLV) based vectors, lentiviral gene transfer can be used for gene delivery into a variety of quiescent or non-dividing cells, such as neuronal, muscle, and liver cells as well as hematopoietic stem cells, and might therefore considerably contribute to the prevention and treatment of genetic disorders, tumor diseases and infections. This includes the transduction of non-dividing, resting or postmitotic cells such as neuronal or hepatic tissue or hematopoietic progenitor cells for the treatment of a variety of genetic disorders or acquired diseases. The lentiviral vector particles are currently prepared by triple-transfection of a GagPol expression plasmid together with a transfer construct (containing the packaging signal and transfer gene, flanked by LTRs) and an expression plasmid encoding an envelope protein derived from an amphotropic or xenotropic retrovirus like Moloney Murine Leukemia Virus or Vesicular stomatitis virus into mammalian cell lines. Lentiviral vector particles from the supernatant of such transfected cells were able to stably transduce a large variety of different cells even after in vivo gene transfer. To increase the safety of these vectors, most of the accessory genes of HIV-1 were deleted from the packaging construct and the vector, thereby minimizing the risk for the emergence of potentially pathogenic replication competent recombinants.

[0012] However, the development of lentiviral Gag and GagPol based vectors for gene transfer into quiescent or non-dividing cells faces several limitations regarding safety and efficiency.

[0013] The currently used HIV-1 or SIV derived GagPol expression plasmids contain parts of the 5'-untranslated region comprising the RNA packaging signal Ψ. Although small deletions were introduced into the assumed packaging signal, it is unlikely that these deletions completely prevent packaging, since similar deletions in the context of an intact 5' leader sequence reduced packaging by less than a factor of 25 (Luban and Goff, 1994, J. Virol. 68(6):3784-93).

[0014] Moreover, if the RNA of the GagPol expression plasmid can indeed be packaged, homologous or non-homologous recombination events between the vector RNA and the GagPol RNA during or after reverse transcription cannot be excluded. This could lead to the - for safety reasons - undesired transfer of the GagPol gene (or parts of it) into target cells.

[0015] Moreover, there are several regions of homology between the GagPol expression plasmids and the lentiviral transgene vector constructs: (1) Rev-dependent as well as CTE mediated, Rev independent GagPol expression (packaging proteins) requires either the complete UTR or at least part of it for efficient expression. The transgene vector constructs also depend for efficient packaging on the RNA packaging signal that overlaps the 5'-UTR. (2) Since in HIV-1, SIV and other lentiviruses the packaging signal extends from the UTR into the gag gene (reviewed in Swanstrom, R. and Wills, J.W., 1997, Cold Spring Harbor: Cold Spring Harbor Laboratory Press, pp. 263-334), the 5' portion of the gag gene is part of many lentivirus derived transgene vector constructs. (3) Finally, both the transgene vector construct and Rev-dependent GagPol expression plasmids contain the Rev-responsive element, since (i) RRE was suggested to include RNA packaging functions and (ii) the transport of the transgene vector RNA and the GagPol RNA from the nucleus to the cytoplasm is Rev-dependent. These regions of extensive homology might facilitate homologous recombination events during either vector production or reverse transcription.
[0016] The problem underlying the present invention is the development of retroviral based expression vectors allowing high level protein production in the absence of transactive proteins and with the lowest risk of self packaging of the vector or of recombination events, thereby avoiding the risk of reconstitution of potentially infectious or otherwise hazardous particles. A further problem underlying the present invention is the provision of vaccines against diseases caused by Lentiviruses.

[0017] The problem of the invention is solved by the subject matter defined in the claims.
[0018] The present invention is further illustrated by the figures. 

FIG. 1: 

[0019] Schematic representation of wildtype and synthetic gag encoding expression plasmids. Open boxes indicate wildtype gag (wtgag) encoding genes including 3' and 5' located cis-acting sequences, whereas syngag without any 3' or 5' untranslated regions (UTR) is boxed black. Wtgag reading frames were fused to the cis-acting sequences 5' located UTR, the Rev responsive-element (RRE), the constitutive transport element (CTE), or combinations thereof. To investigate the influence of the major splice donor (SD) on CTE mediated Rev-independent Gag expression, the UTR was mutated in order to destroy the splice-donor consensus sequence within this upstream located sequence, resulting in mutant ΔSD-wtgag-CTE.

FIG. 2: 

[0020] Representation of syngag and wildtype gag open reading frames. (A) The Gag coding sequence was adapted to a codon usage occurring in highly expressed mammalian genes and aligned to the corresponding wildtype sequence shown above. Sequence identity between the synthetic and wildtype gene is indicated by lines. (B) The table indicates the different sequence composition of the wildtype and synthetic gag reading frame. The fully synthetic gag gene with an optimized codon usage shows a marked increase in G/C content whereas the overall A/U content is reduced. Moreover, codon usage adaptation increased the number of immune-stimulatory CpG motifs (Pur-Pur-C-G-Pyr-Pyr) within the coding region from only 1 to 5.
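The sequence statistics named in this legend (G/C content and the number of Pur-Pur-C-G-Pyr-Pyr hexamers) can be tallied with a short script. The following is a minimal sketch only; the function names and the test strings are illustrative assumptions, and only the strand given is scanned.

```python
import re

# Pur-Pur-C-G-Pyr-Pyr: purines are A or G, pyrimidines are C or T.
# Two such hexamers cannot overlap, so a plain non-overlapping scan suffices.
CPG_MOTIF = re.compile(r"[AG][AG]CG[CT][CT]")


def count_cpg_motifs(seq: str) -> int:
    """Count Pur-Pur-C-G-Pyr-Pyr hexamers on the given strand of a DNA sequence."""
    return len(CPG_MOTIF.findall(seq.upper()))


def gc_content_percent(seq: str) -> float:
    """Return the G/C content of a DNA sequence in percent."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from SEQ ID NO:1 or 2.
    toy = "AACGTTGACGCC"
    print(count_cpg_motifs(toy), gc_content_percent(toy))
```

Applied to the wildtype and synthetic reading frames, such a tally would be expected to reproduce the kind of comparison summarized in panel (B); no values are asserted here.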

FIG. 3: 

[0021] H1299 cells were transiently transfected with the indicated plasmids. For standardization, Gag expression was compared to wildtype-like Rev/RRE dependent expression driven by a Rev expression vector, whereas Rev-independent expression plasmids were co-transfected with vector (indicated as + or - respectively). (A) Rev/RRE mediated wildtype expression. (B) CTE-mediated wildtype expression. (C) Synthetic expression using a codon optimized reading frame. Cells were harvested 48 hours post-transfection and lysed, and 50 µg of total protein were subjected to Western Blot analysis (A-C, lower panel). Yields of Pr55gag were measured by testing different dilutions of cell lysate in a capture ELISA using purified Pr55gag for standardization (A-C, upper panel) and were normalized to the Pr55gag production obtained by autologous Rev/RRE dependent expression. Different cell lines were transfected to rule out cell type specific effects on Gag expression levels (D). Bars represent relative Pr55gag expression levels (taking UTR-wtgag-RRE+Rev as 100%) and are given as the mean of triplicate determinations.
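The normalization used for the bars (mean of triplicate determinations, expressed relative to the UTR-wtgag-RRE+Rev reference set to 100%) amounts to a simple ratio. A minimal sketch follows; the function name and all numeric readings are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def relative_expression_percent(sample_readings, reference_readings):
    """Mean of the sample replicates as a percentage of the mean reference reading."""
    reference_mean = mean(reference_readings)
    if reference_mean <= 0:
        raise ValueError("reference mean must be positive")
    return 100.0 * mean(sample_readings) / reference_mean


if __name__ == "__main__":
    # Hypothetical ELISA readings (arbitrary units), three replicates each.
    reference = [1.0, 1.0, 1.0]   # stands in for UTR-wtgag-RRE + Rev
    sample = [2.0, 2.0, 2.0]      # stands in for any other construct
    print(relative_expression_percent(sample, reference))
```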

FIG. 4: 

[0022] Immunological analysis of an HIV-1 DNA Gag vaccine based on synthetic and wildtype sequences. 80 µg of wildtype (wt-gag) and synthetic (syngag) encoding DNA expression plasmids were injected intramuscularly (i.m.) into BALB/c mice, which were boosted twice, after 3 weeks and 6 weeks, and sacrificed after 7 weeks. (A) Strength of humoral responses induced in immunized BALB/c mice. Each bar represents the group mean (n=4) for anti-gag IgG1, anti-gag IgG2 and total IgG as determined by end-point dilution ELISA. End-point titers of the immune sera were defined as the reciprocal of the highest plasma dilution that resulted in an absorbance value (OD 492) three times greater than that of a preimmune serum with a cut-off value of 0.05. (B) Cytotoxic T cell activity in splenocytes from mice immunized intramuscularly or subcutaneously by particle gun (p.g.) with gag expression vectors. Lymphoid cells obtained from mice five days after the boost injection were co-cultured with gag peptide-pulsed syngeneic P815 mastocytoma cells (irradiated with 20,000 rad). Control assays included splenocytes of non-immunized and wt-gag immunized mice stimulated in vitro with peptide-pulsed P815 cells. Cytotoxic effector populations were harvested after 5 days of in vitro culture. The cytotoxic response was read against gag-peptide pulsed A20 cells and untreated A20 negative target cells in a standard 51Cr release assay. Data shown are mean values of triplicate cultures.
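The end-point titer definition given for panel (A) can be written out as a small calculation. The sketch below rests on one stated assumption about how the cut-off is applied: the preimmune OD is floored at 0.05 before being multiplied by three. The function name, that interpretation and all readings are illustrative and not taken from the experiments.

```python
def endpoint_titer(od_by_dilution, preimmune_od, fold=3.0, cutoff=0.05):
    """Return the reciprocal of the highest dilution whose OD exceeds the threshold.

    od_by_dilution maps reciprocal dilutions (e.g. 100 for 1:100) to OD 492 readings.
    Assumption: the threshold is fold * max(preimmune_od, cutoff).
    Returns None if no dilution exceeds the threshold.
    """
    threshold = fold * max(preimmune_od, cutoff)
    positive = [dilution for dilution, od in od_by_dilution.items() if od > threshold]
    return max(positive) if positive else None


if __name__ == "__main__":
    # Hypothetical dilution series of one immune serum.
    readings = {100: 1.2, 1000: 0.6, 10000: 0.2, 100000: 0.08}
    print(endpoint_titer(readings, preimmune_od=0.04))
```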

FIG. 5: 

[0023] Maps of the packaging constructs based on optimized GagPol reading frames of SIV and HIV (A) and of the HIV-1- and SIV-derived transgene vectors or packaging constructs (B). Regions deleted from the SIV genome are marked by shaded boxes with the first and last deleted nucleotide given after the "Δ" sign (numbering according to Genbank entry M33262). Point mutations that inactivate reading frames are marked by asterisks, inactivated reading frames are marked in black. Synthetic GagPol reading frames are hatched. BGHpAd: bovine growth hormone polyadenylation signal; CMV: immediate early promoter/enhancer region from human cytomegalovirus; SFFV-Pr: Spleen focus forming virus U3 region; GFP: gene for the green fluorescent protein.

FIG. 6: 

[0024] Immunoblot analysis of cell culture supernatants harvested 3 days after transfection of H1299 cells with the indicated lentiviral (HIV-1: Hgpsyn and SIVmac239: Sgpsyn) GagPol expression constructs and proviral DNAs. Released GagPol particles were enriched by continuous sucrose density gradient sedimentation. Antigen peak fractions were separated on a denaturing 12.5% SDS-PAGE and analyzed by immunoblotting using HIV-1 p24 (A) or SIV p27 (B) specific monoclonal antibodies. Protease activity and functionality in polyprotein processing were shown in the absence (-) or presence (+) of Saquinavir. Molecular weights of the detected precursor proteins (Pr) and mature cleavage products are indicated. (C) Additionally, antigen peak fractions were tested for their content of HIV-1 p24 capsid antigen using a commercially available capture ELISA format (DuPont, Boston, USA).

FIG. 7: 

[0025] Transduction of growth-arrested cells. 293T cells were transfected with ViGΔBH, pHIT-G and the indicated lentiviral GagPol expression plasmids. The MLV vector was generated by transfecting plasmids pLEGFP-N1, pHIT60, and pHIT-G. Vector titers in the supernatant of transfected cells were determined on 293 cells in the presence of the indicated concentrations of aphidicolin. The titration was done in triplicate or quadruplicate. The means and the standard deviations are shown. GFU: green fluorescence forming units.

FIG. 8: 

[0026] Detection of replication competent recombinants. CEMx174-SIV-SEAP cells were infected with the supernatant of 293T cells co-transfected with SIV-GFPΔDP (A) or SIV-GFPΔBP (B) and the indicated gag-pol expression plasmids. RLU: relative light units.
FIG. 9:

[0027] Schematic representation of a codon optimized GagPol derivative which can be used for DNA vaccination. For safety and regulatory reasons the packaging signal was removed, the integrase deleted, and the reverse transcriptase (RT) gene disrupted by insertion of a scrambled nef gene at the 3' end of the DNA sequence coding for the RT active site. The nef gene was dislocated by fusing its 5' half to its 3' half. Myristoylation of the GagPolNef particles is inhibited by a glycine to alanine mutation at the myristoylation site of the gag gene.

[0028] The term "packaging proteins" as used herein refers to GagPol proteins of retroviruses, spumaviruses, preferably HTLV-1 and 2, more preferably all lentiviruses, most preferably HIV-1, HIV-2, SIV, FIV, and in particular those encoded by SEQ ID NO:2 (GagPol HIV-1 IIIB) and SEQ ID NO:1 (GagPol SIVmac239), which are capable of packaging an RNA containing the packaging site Ψ, such as a retroviral, preferably lentiviral, most preferably HIV-1, HIV-2, SIV or FIV genome and/or a so called "transfer construct" which is specified in detail below.

[0029] The term "packaging construct" as used herein refers to an expression plasmid or vector system used to produce packaging proteins in eucaryotic cells.
[0030] The term "transfer construct" as used herein refers to an expression plasmid or vector system used to produce an RNA which contains the packaging site Ψ and the transfer gene and is flanked by LTRs. This RNA is capable of being incorporated into a transducing particle which itself is produced by a eucaryotic cell expressing packaging proteins. The LTRs and the packaging site Ψ are of retroviral, preferably spumaviral or lentiviral, and most preferably of HIV-1, HIV-2, SIV or FIV origin.

[0031] The term "vector particle" as used herein refers to a transduction competent viral particle comprising packaging proteins, envelope proteins as exposed on the surface of the viral particle as well as an incorporated transfer construct.

[0032] The present invention relates to nucleic acid sequences encoding the gag and pol polypeptides, whereby the amino acid Ala is encoded by the nucleic acid triplet gcc or gct, Arg is encoded by agg or aga, Asn is encoded by aac or aat, Asp is encoded by gac or gat, Cys is encoded by tgc or tgt, Gln is encoded by cag or caa, Glu is encoded by gag or gaa, Gly is encoded by ggc or gga, His is encoded by cac or cat, Ile is encoded by atc or att, Leu is encoded by ctg or ctc, Lys is encoded by aag or aaa, Met is encoded by atg, Phe is encoded by ttc or ttt, Pro is encoded by ccc or cct, Ser is encoded by agc or tcc, Thr is encoded by acc or aca, Trp is encoded by tgg, Tyr is encoded by tac or tat, Val is encoded by gtg or gtc and the stop codon is tga or taa.

[0033] Preferably, the present invention relates to nucleic acid sequences encoding the gag and pol polypeptides, whereby the amino acid Ala is encoded by the nucleic acid triplet gcc or gct, Arg is encoded by agg or aga, Asn is encoded by aac or aat, Asp is encoded by gac or gat, Cys is encoded by tgc or tgt, Gln is encoded by cag or caa, Glu is encoded by gag or gaa, Gly is encoded by ggc or gga, His is encoded by cac or cat, Ile is encoded by atc or att, Leu is encoded by ctg or ctc, Lys is encoded by aag or aaa, Met is encoded by atg, Phe is encoded by ttc or ttt, Pro is encoded by ccc or cct, Ser is encoded by agc or tcc, Thr is encoded by acc or aca, Trp is encoded by tgg, Tyr is encoded by tac or tat, Val is encoded by gtg or gtc and the stop codon is tga or taa, whereby the nucleic acid sequence in the region in which the reading frames encoding the gag and pol polypeptides overlap corresponds to the wildtype nucleic acid sequence encoding the gag and pol polypeptides.
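Keeping the wildtype sequence in the gag/pol overlap while using optimized codons elsewhere can be pictured as a splice of two equally long sequences. The sketch below assumes 1-based, inclusive coordinates and a synthetic sequence of the same length as the wildtype one (codon replacement does not change length); the function name and the toy strings are hypothetical.

```python
def keep_wildtype_region(synthetic: str, wildtype: str, start: int, end: int) -> str:
    """Return the synthetic sequence with positions start..end taken from wildtype.

    Coordinates are 1-based and inclusive, as is usual for sequence listings.
    """
    if len(synthetic) != len(wildtype):
        raise ValueError("sequences must have equal length")
    if not 1 <= start <= end <= len(synthetic):
        raise ValueError("region lies outside the sequence")
    return synthetic[: start - 1] + wildtype[start - 1 : end] + synthetic[end:]


if __name__ == "__main__":
    # Toy strings only; real coordinates would refer to SEQ ID NO:1 or SEQ ID NO:2.
    print(keep_wildtype_region("GGGGGGGGG", "AAAAAAAAA", 4, 6))
```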

[0034] Most preferably, the present invention relates to nucleic acid sequences as depicted in SEQ ID NO:1 or 2.
[0035] We were able to demonstrate that consistent and strict codon optimization of the Gag and GagPol reading frames derived from the Simian Immunodeficiency virus SIVmac239 or the Human Immunodeficiency virus HIV-1 IIIB allowed high level expression of Gag, GagPol and derivatives thereof in the complete absence of Rev/RRE, CTE and UTR, which can be used directly for DNA vaccination.

[0036] Moreover, we were able to demonstrate that consistent and strict codon optimization of SIVmac239 or HIV-1 IIIB GagPol reading frames, with the exception of the region where the Gag and Pol reading frames overlap, allowed high level expression of GagPol and derivatives thereof in the complete absence of Rev/RRE, CTE and UTR, and that the resulting genes can be used as packaging constructs for gene delivery.

[0037] Therefore, safe and efficient delivery and expression of Gag or GagPol genes according to the present invention (i) allows expression of high yields of native proteins in higher eucaryotes such as mammalian or insect cells and (ii) supports the generation of transduction competent particles for vaccination and gene therapy purposes, which is currently limited by the complex regulatory mechanisms of late Gag and GagPol expression.
[0038] The present invention is based on the observation that the codon usage in the HIV genome (A/T content 55.88%) significantly deviates from that of highly expressed mammalian genes (usually not exceeding 50%), indicating a regulatory function of the codon usage for timely regulated lentiviral gene expression.
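The A/T content referred to here is a simple base-composition percentage. A minimal sketch of the calculation is given below; the function name is an assumption, RNA input is accepted by treating U as T, and ambiguous symbols such as N are ignored. The example string is a toy sequence, not HIV sequence data.

```python
def at_content_percent(seq: str) -> float:
    """Return the A/T content of a nucleic acid sequence in percent.

    U is treated as T; characters other than A, C, G, T are ignored.
    """
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T/U")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


if __name__ == "__main__":
    print(at_content_percent("AUGC"))
```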

[0039] Additionally, we were able to show that CTE-mediated expression of wildtype HIV-1 late gene products (such as gag), like Rev/RRE mediated expression, essentially depends on the presence of the HIV-1 5'-untranslated region (UTR) or at least parts of it, which also contains an RNA stretch responsible for packaging of the genomic RNA into the viral particle, and therefore touches sensitive safety issues.
[0040] In order to eliminate inhibitory or otherwise regulatory sequences within lentiviral Gag and GagPol genes, reading frames were optimized according to the codon usage found in highly expressed mammalian genes. Preferably, codons that are frequently found in highly expressed mammalian genes (Ausubel et al. 1994, Current Protocols in Molecular Biology 2, A1.8-A1.9), most preferably in highly expressed human genes, were used to create a fully synthetic reading frame not occurring in nature, which, however, encodes the very same product as the original wildtype gene construct. For that purpose, a matrix was generated considering almost exclusively those codons that are used most frequently and, less preferably, those that are used second most frequently in highly expressed mammalian genes as depicted in Table 1.



Table 1: Codons most frequently (codon 1) and second most frequently (codon 2) found in highly expressed mammalian genes.

amino acid   codon 1   codon 2
Ala          GCC       GCT
Arg          AGG       AGA
Asn          AAC       AAT
Asp          GAC       GAT
Cys          TGC       TGT
End          TGA       TAA
Gln          CAG       CAA
Glu          GAG       GAA
Gly          GGC       GGA
His          CAC       CAT
Ile          ATC       ATT
Leu          CTG       CTC
Lys          AAG       AAA
Met          ATG       ATG
Phe          TTC       TTT
Pro          CCC       CCT
Ser          AGC       TCC
Thr          ACC       ACA
Trp          TGG       TGG
Tyr          TAC       TAT
Val          GTG       GTC

(Ausubel et al. 1994, Current Protocols in Molecular Biology 2, A1.8-A1.9).



[0041] Based on this matrix, the Wisconsin Genetics Computer Group (GCG) software package was employed to create the synthetic reading frame by backtranslating the amino acid sequence of lentiviral Gag and GagPol polypeptides into the codon optimized synthetic reading frame.
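The back-translation step can be illustrated with a short script. This is a minimal sketch and not the GCG software, whose interface is not reproduced here; the dictionary `CODON_1` transcribes the "codon 1" column of Table 1 using one-letter amino acid codes, and the function name `back_translate` and the example peptide are illustrative assumptions.

```python
# "Codon 1" column of Table 1, keyed by one-letter amino acid code ("*" = stop).
CODON_1 = {
    "A": "GCC", "R": "AGG", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def back_translate(protein: str) -> str:
    """Back-translate a protein sequence using only the most frequent codon."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in CODON_1:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        codons.append(CODON_1[residue])
    return "".join(codons)


if __name__ == "__main__":
    # Short illustrative peptide followed by a stop codon.
    print(back_translate("MGARAS*"))
```

A strict first-codon back-translation of this kind yields only a starting point; any deviations from it, such as added restriction sites, would be introduced afterwards.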

[0042] A few deviations from strict adherence to the usage of the most frequently found codons may be made (i) to accommodate the introduction of unique restriction sites at approximately 250 to 300 nt intervals (see e.g. Fig. 2) and (ii) to break G or C stretches extending more than 7 base pairs in order to allow consecutive PCR amplification and sequencing of the synthetic gene product. This may also be performed for the Gag and Pol based expression plasmids used for DNA vaccination as well as the packaging constructs used for gene transfer, with the exception of the region of the latter where the gag and pol reading frames overlap. In particular, it is preferred that the overlapping region of the gag and pol reading frames of the packaging constructs used for gene transfer is not altered at position 1298 - 1558 of SEQ ID NO:2 and position 1175 - 1532 of SEQ ID NO:1, in order to maintain both reading frames.
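Locating the G or C stretches that would call for a deviation from the most frequent codon can be sketched as follows. The interpretation is an assumption: a "stretch" is read here as an uninterrupted run of a single base (all G or all C) longer than 7 nucleotides. The function name and the toy sequence are hypothetical.

```python
import re


def find_long_gc_runs(seq: str, max_run: int = 7):
    """List runs of G or of C longer than max_run.

    Returns (start, end, run) tuples with 1-based, inclusive coordinates.
    Assumption: only homopolymer runs are reported, not mixed G/C stretches.
    """
    min_len = max_run + 1
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_len, min_len))
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence containing one run of nine C residues.
    print(find_long_gc_runs("AT" + "C" * 9 + "AT"))
```

Runs reported inside the gag/pol overlap would be left untouched, since that region is preferably not altered.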
[0043] In particular, the use of synthetic Gag and Pol genes comprising the GagPol genes of all Retroviruses, Human T-cell Leukemia Virus 1 and 2, Spuma- and preferably Lentiviruses, in particular those derived from HIV and SIV as encoded by SEQ ID NO:2 (GagPol HIV-1 IIIB) and SEQ ID NO:1 (GagPol SIVmac239), therefore allows the safe and efficient production of packaging proteins for delivering genetic information to cells, whereby codons of the Gag and GagPol reading frames were adapted to highly expressed mammalian genes as specified above, except in regions where the gag and pol reading frames overlap.

[0044] The present invention therefore allows the safe and efficient production of packaging proteins by increasing
the yields of GagPol synthesis as compared to the expression rates achieved from the wildtype gene or wildtype-like
genes characterized by clustered point mutations - if driven by the same cellular or viral promotor/enhancer unit - as
a result of consistent adaptation of the codon usage to that of highly expressed mammalian genes.

[0045] The present invention allows the safe and efficient production of packaging proteins in complete absence of 
known cis-acting elements (UTR, RRE) or transactive proteins such as Rev and Tat. 

[0046] The present invention further relates to retroviral gag or gagpol based particles generated by nucleic acid
molecules comprising the nucleic acid sequence according to the invention having an at least 25-fold reduced incorporation
rate of said nucleic acid molecules encoding the gag and pol polypeptides compared to retroviral particles generated
by wildtype nucleic acid sequences encoding the gag and pol polypeptides.

[0047] The present invention further relates to retroviral gag or gagpol based particles produced by nucleic acid
molecules comprising the nucleic acid sequence according to the invention having an at least 100-fold reduced recombination
rate between said nucleic acid molecules encoding the gag and pol polypeptides and a wildtype genome
based transgene construct derived from the same or another retrovirus.

[0048] The present invention allows the efficient production of safe vector particles by reducing the opportunities for 
reconstituting infection competent hybrid retroviral particles by recombination events between nucleic acids encoding 
the packaging proteins and sequences encoding endogenous retroviruses. 

[0049] The present invention allows the efficient production of safe vector particles by reducing the risk of incorporating
the RNA species encoding the packaging proteins into vector particles by more than 25-fold as compared to
expression constructs containing the 5'-UTR or encoding the wildtype GagPol gene or by diminishing the risk of packaging
the very same RNA into endogenous retroviral particles.

[0050] The present invention allows the efficient production of safe vector particles by reducing the risk of recombination
between the packaging construct and the gene transfer construct by more than 100-fold, by combining a synthetic
gene encoding the GagPol packaging function from one retro-, spuma- or lentivirus with the wildtype genome based
transgene construct derived from (i) the very same or (ii) another retro-, spuma- or lentivirus.

[0051] More specifically, the present invention allows the efficient production of safe vector particles by combining
the synthetic gene encoding the HIV-1 derived packaging proteins with the wildtype HIV-1 genome based transgene
construct or by combining the synthetic gene encoding the SIVmac239 derived packaging proteins with the wildtype
SIVmac239 genome based transgene construct or combinations thereof.

[0052] Preferably, by combining the synthetic gene encoding the HIV-1 derived packaging proteins with the wildtype
SIVmac239 genome based transgene construct or by combining the synthetic gene encoding the SIVmac239 derived
packaging proteins with the wildtype HIV-1 genome based transgene construct, thereby reducing homologies between
transfer vector and packaging vector (as e.g. depicted in table 1 of example 4), resulting in safe and efficient packaging
constructs being useful for the production of safe lentiviral vectors and gene delivery systems.

[0053] Synthetic GagPol genes may or may not encode the active integrase. In the latter case, the integrase gene
may be deleted in total. Alternatively, the enzyme activity may be knocked out by one deletion or several short continuous
in frame deletions of various length, preferably not exceeding 5 amino acids and, more preferably, comprising
a single codon. Also, the integrase may be rendered inactive by point mutations at various sites, preferably by a single
point mutation (e.g. position 3930 - 3932 in SEQ ID No:2 and position 3957 - 3959 in SEQ ID No:1 are changed to
glutamine encoding nucleotides).

[0054] The present invention furthermore relates to retroviral packaging cells for the generation of the retroviral 
particles according to the invention transformed with nucleic acid molecules comprising the nucleic acid sequence 
according to the invention. 

[0055] GagPol packaging proteins according to the present invention may be stably, inducibly or transiently expressed
in any primary or lab adapted cell or cell line of human, mammalian or non-mammalian origin. The synthetic
GagPol genes may be episomal or chromosomal within these cells and may contain additional recombinant genes,
preferably any specific fusogenic gene, more preferably any viral envelope gene, more preferably a retroviral envelope
gene. These cells may also contain either an episomal or chromosomal transfer vector, but may not contain any additional
viral wildtype sequences.

[0056] GagPol expression may be driven by any tissue- or cell type-specific promotor/enhancer such as muscle
creatine kinase or MHC class I promotor/enhancer or by any viral e.g. CMV immediate early, Rous Sarcoma Virus
promotor/enhancer, Adenovirus major late promotor/enhancer or nonviral e.g. synthetic promotor/enhancer.
[0057] GagPol expression may be either constitutive or inducible e.g. by addition of a specific inducer like ecdysone
in the case of a hormone inducible promotor/enhancer or by removal of a repressor protein (Tet on/off) or repressor
gene (Cre/Lox).

[0058] GagPol based gene delivery into cells may be mediated by any means of transfecting or introducing a plasmid
or plasmid derivative such as "midges" (closed linear DNA) encoding the synthetic GagPol gene or the above variants
thereof into cells.

[0059] GagPol based gene delivery into cells may also be mediated by infectious recombinant (i) viral vectors such
as recombinant Retroviruses, Vaccinia- or Poxviruses, Adenoviruses, Alphaviruses, Herpesviruses, Baculoviruses or
any other recombinant virus, (ii) subviral components bridging transfection and infection procedures like e.g. Virosomes
comprising nucleic acid/protein aggregates containing e.g. Influenza hemagglutinin, (iii) bacterial vectors such as recombinant
Listeriae or Salmonellae or any other type of infectious, replicating or non-replicating vector.
[0060] Any means of introducing GagPol genes and derivatives thereof into cells may lead to a transient expression 
of GagPol. The introduction of the above indicated GagPol genes into cells may also allow the establishment of stable 
cell lines supporting either constitutive or inducible GagPol expression. 
[0061] Retroviral, preferably Spumaviral and Lentiviral and, more specifically, HIV or SIV derived vector particles
carrying the transgene RNA and being capable of transducing nondividing, resting or postmitotic cells of various species
in addition to dividing or proliferating cells may be generated by co-expressing a synthetic GagPol construct according
to the invention together with any appropriate amphotropic envelope like the vesicular stomatitis virus G protein, xenotropic
envelope like the env protein of avian or murine sarcoma viruses e.g. the Rous sarcoma virus (RSV) Env protein,
or cell type specific receptor protein like the HIV-1 Env protein, in the presence of any appropriate gene transfer construct
encoding the desired packaging competent transgene RNA.
[0062] Transduction competent vector particles may be generated by co-transfecting or co-infecting or transfecting/
infecting primary cells or cell lines of various origin such as mammalian Vero cells, HeLa, Cos, CHO or H1299 cells or
insect SF9, High-5 or DS-2 cells with vector constructs encoding the synGagPol derived packaging proteins, and any
appropriate envelope or receptor structure as well as the desired transgene.

[0063] Alternatively, transduction competent vector particles may be generated by stable cell lines expressing GagPol
from the synthetic GagPol gene according to the invention, either inducibly or constitutively, whereby the stable cell
lines may be co-transfected or co-infected or transfected/infected with vector constructs encoding any envelope or
receptor structure as well as the desired transgene.
[0064] Preferably, transduction competent vector particles may be generated by stable cell lines expressing GagPol
from the synthetic GagPol gene according to the invention together with any appropriate envelope or receptor structure,
either inducibly or constitutively, whereby the stable cell lines may be transfected or infected with vector constructs
encoding the desired transgene.

[0065] Ideally, transduction competent vector particles may be obtained from a stable cell line mediating the expression
of the synthetic GagPol gene and any appropriate receptor or envelope function together with the generation of
the packaging competent transgene RNA, either in an inducible manner or constitutively or a combination thereof.
[0066] Most preferably, transduction competent vector particles are generated by transfection of cells with a plasmid
or derivative thereof or by infecting cells with any infectious, although not necessarily replication competent vector
encoding the synGagPol derived packaging proteins, and any appropriate envelope or receptor structure as well as
the desired transgene.

[0067] Both transduction competent vector particles, as well as infectious, although not necessarily replication competent
vectors encoding the synGagPol derived packaging proteins, and any appropriate envelope or receptor structure
as well as the desired transgene may be used for application ex vivo and in vivo.

[0068] Furthermore, the present invention relates to nucleic acid molecules comprising the nucleic acid sequence
according to the invention for use as an active pharmaceutical substance.

[0069] The present invention further relates to the use of nucleic acid molecules comprising the nucleic acid sequence
according to the invention for the preparation of a pharmaceutical composition for the treatment and prophylaxis of
diseases caused by retroviruses.

[0070] Moreover, the present invention can be used for vaccination purposes, in particular if used as nucleic acid
based vaccination constructs, which have been shown to elicit a strong humoral and Th-1 like cellular immune response
in mice. On the basis of the GagPol genes of the present invention it is now possible to construct effective retroviral,
HTLV-1 and -2 derived as well as lentiviral and in particular HIV and SIV based Gag and GagPol vaccines comprising
all known HIV and SIV clades and sequences as well as derivatives, thereby increasing the yields of GagPol production
by the use of codon optimized GagPol genes or derivatives thereof as compared to the expression rates achieved from
the wildtype gene or wildtype-like genes characterized by clustered point mutations - if driven by the same cellular or
viral promotor/enhancer unit - as a result of consistent adaptation of the codon usage to that of highly expressed
mammalian genes.

[0071] A preferred object of the present invention is the construction of GagPol based vaccines and in particular HIV
and SIV based vaccines, especially modified also to improve safety of GagPol derived candidate vaccines if delivered
e.g. as a DNA vaccine, by any kind of infectious, replicating or nonreplicating vector or if administered as GagPol
derived virus-like particles, by consistent and strict codon usage adaptation in complete absence of known cis-acting
elements or transactive proteins such as Rev and Tat.

[0072] Another preferred object of the present invention is the construction of GagPol based vaccines by reducing
the opportunities for reconstituting infection competent hybrid retroviral particles by recombination events between
nucleic acids encoding the packaging proteins and sequences encoding endogenous retroviruses.
[0073] Another preferred object of the present invention is the construction of GagPol based vaccines by reducing
the risk of incorporating the RNA species encoding Gag, GagPol or derivatives thereof into subviral Gag or GagPol
derived particles by more than 25-fold as compared to expression constructs containing the 5'-UTR or encoding the
wildtype Gag and GagPol derived genes and by diminishing the risk of packaging the very same RNA into endogenous
retroviral particles, thereby reducing the risk of recombination between the nucleic acid derived from a vaccine construct
encoding Gag, GagPol or derivatives thereof according to the present invention with DNA or RNA derived from an
infection by HIV, SIV or any lenti- or retrovirus or otherwise related virus.

[0074] This results in safe and efficient DNA constructs being useful for the production and delivery of safe GagPol
derived vaccine constructs.

[0075] In order to further increase the safety of the vaccine constructs, codon optimized Gag or GagPol reading 
frames may be modified at distinct sites. 

[0076] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that the myristoylation
of Gag-precursors is inhibited (e.g. position 4 - 6 in SEQ ID No:2 and position 4 - 6 in SEQ ID No:1 are changed to
alanine or valine encoding nucleotides), thereby preventing budding of Gag or GagPol derived particles and increasing
the amount of endogenously produced antigen.
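
A codon exchange of this kind, addressed by nucleotide position, can be sketched as follows. The sketch assumes positions are 1-based and counted from the first nucleotide of the start codon; the function name `replace_codon` and the 12 nt toy sequence are hypothetical and do not reproduce SEQ ID No:1 or SEQ ID No:2.

```python
def replace_codon(seq: str, start: int, new_codon: str) -> str:
    """Replace the in-frame codon that begins at 1-based position `start`."""
    if len(new_codon) != 3:
        raise ValueError("a codon must have exactly three nucleotides")
    if start < 1 or (start - 1) % 3 != 0:
        raise ValueError("start position is not the first base of a codon")
    index = start - 1
    if index + 3 > len(seq):
        raise ValueError("codon lies outside the sequence")
    return seq[:index] + new_codon + seq[index + 3:]


# Toy reading frame ATG GGC GCC AGG (Met-Gly-Ala-Arg).
toy_frame = "ATGGGCGCCAGG"
# Exchange the glycine codon at positions 4 - 6 for an alanine codon (GCC).
print(replace_codon(toy_frame, 4, "GCC"))
```

The same helper would apply to the other single-codon changes described in this section, for example at the protease, reverse transcriptase or integrase active sites, given the respective positions.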

[0077] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that the Gag and Pol
coding regions use the same reading frame, by introducing a ribosomal frameshift through either inserting one nucleotide
or deleting two nucleotides at any position, without creating a premature stop codon but within the Gag coding
region, preferably between the position of the natural frameshift target site (slippery sequence; corresponding to position
1297 of SEQ ID No:1 and position 1294 of SEQ ID No:2) and the Gag stop codon (specified by SEQ ID No:3), thereby
increasing the amount of synthesized pol gene products.

[0078] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that the protease
activity is rendered inactive by a deletion comprising the complete or part of the PR gene or by one or several short
continuous in frame deletions of various length, preferably not exceeding 5 amino acids and more preferably comprising
a single codon, most preferably encoding the aspartic acid at the PR active site. Also, the PR may be rendered
inactive by point mutations at various sites, preferably by a single point mutation at any position, most preferably at the
active site (e.g. position 1632-1635 in SEQ ID No:2 and position 1575-1577 in SEQ ID No:1 are changed to asparagine
or alanine encoding nucleotides).

[0079] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that the active site of
the reverse transcriptase gene (RT) is rendered inactive by a deletion comprising the complete or part of the RT gene
or one or several short continuous in frame deletions of various length, preferably not exceeding 5 amino acids and
more preferably comprising a single codon encoding an amino acid at any position, most preferably within the active
site being critical for RT function. Also, the RT may be rendered inactive by point mutations at various sites, preferably
by a single point mutation at the active site (e.g. position 2349 - 2351 in SEQ ID No:2 and position 2136 - 2138 in SEQ
ID No:1 are changed to asparagine or glutamic acid encoding nucleotides).

[0080] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that the active site of
the reverse transcriptase gene (RT) is dislocated e.g. to the very C-terminus of the synthetic GagPol gene in order to
inhibit polymerase activity, thereby preventing the production of potentially hazardous nucleotide sequences and/or
reconstitution of infectious or by other means hazardous particles.

[0081] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that other lentiviral
genes such as nef, rev, tat, which are rendered biologically inactive (e.g. by gene scrambling), are inserted into selected
parts of the Gag or GagPol reading frames, preferably into the active site of RT, in order to broaden immunogenicity
and safety.

[0082] Preferably, the Gag or GagPol based synthetic reading frame is modified in such a way that potentially
hazardous genes, like the integrase (IN), are deleted. Alternatively, the enzyme activity may be knocked out by one
deletion or several short continuous in frame deletions of various length, preferably not exceeding 5 amino acids and
more preferably comprising a single codon. Also, the integrase may be rendered inactive by point mutations at various
sites, preferably by a single point mutation (e.g. position 3930 - 3932 in SEQ ID No:2 and position 3957 - 3959 in SEQ
ID No:1 are changed to glutamine encoding nucleotides).

[0083] Alternatively, the Gag or GagPol based synthetic reading frame is modified by any combination of the above 
described possibilities. An example of an HIV-1 GagPol derivative modified as described above is depicted in Fig. 9
and specified in SEQ ID No: 4.

[0084] Strict adherence to the codon usage of highly expressed mammalian genes also leads to an increase of the 
overall GC, GG and CC content and thereby introduces potentially immunomodulatory nucleic acid motifs that increase 
humoral and cellular immune responses at least 2-fold as compared to the corresponding wildtype gene or wildtype 
like genes that were rendered partially Rev independent by clustered point mutations. 

[0085] The GagPol product or derivatives thereof encoded by the gagpol genes of the present invention may be 
stably, inducibly or transiently expressed in any primary or lab adapted cell or cell line of human, mammalian or non- 
mammalian origin in order to produce GagPol antigens, GagPol containing virus-like particles or derivatives thereof. 
[0086] The expression of codon optimized GagPol or derivatives thereof in cell culture as well as in vivo following
immunization with DNA or vector constructs specified below may be driven by any tissue- or cell type-specific cellular
promotor/enhancer such as muscle creatine kinase or MHC class I promotor/enhancer or by any viral (e.g. CMV immediate
early, Rous Sarcoma Virus promotor/enhancer, Adenovirus major late promotor/enhancer) or nonviral e.g. synthetic
promotor/enhancer.

[0087] GagPol expression may be either constitutive or inducible e.g. by addition of a specific inducer such as ecdysone
in the case of a hormone inducible promotor/enhancer or by removal of a repressor (Tet on/off) or repressor gene
(Cre/Lox).

[0088] The delivery of codon optimized GagPol genes or derivatives thereof into cells in vitro may be mediated by 
any means of transfecting or introducing a plasmid or plasmid derivative such as "midges" (closed linear DNA) encoding 
the synthetic GagPol gene or the above variants thereof into cells. 

[0089] The delivery of codon optimized GagPol genes or derivatives in vivo may be mediated by injection of the
plasmid, plasmid derivative or infectious, replicating or non-replicating vector into any site, preferably intramuscularly,
subcutaneously or intradermally, co-administered with any kind of adjuvant such as liposomes, ISCOMS, alum or via
biodegradable particles, either directly or using technical devices like particle gun, biojector or by any other means.
[0090] The delivery of codon optimized GagPol genes or derivatives into cells in vitro and in vivo may also be mediated
by infectious recombinant viral vectors such as recombinant Retroviruses, Vaccinia- or Poxviruses, Adenoviruses,
Alphaviruses, Herpesviruses, Baculoviruses or any other recombinant virus, subviral components bridging transfection
and infection procedures like e.g. Virosomes comprising nucleic acid/protein aggregates containing e.g. Influenza
hemagglutinin, or bacterial vectors such as recombinant Listeriae or Salmonellae or any other type of infectious, replicating
or non-replicating vector.

Example 1:

[0091] Construction of the synthetic gag gene. All subsequent numbering of HIV-1 wildtype nucleotide sequences
corresponds to the HIV-1 isolate BH10 (GenBank Accession: M15654). All subsequent numbering of synthetic HIV-1
Gag encoding reading frames is relative to the start codon of the respective coding region. Positions of restriction sites
are defined by their cleavage site.

[0092] A synthetic sequence coding for the HIV-1 IIIB Pr55gag polyprotein was generated by translating the amino acid
sequence of the gag coding region (nucleotides 112 - 1650) into a synthetic gag (syngag) coding sequence using
the codon usage occurring most frequently in mammalian cells as depicted in Seq ID No: 5. In order to divide the
syngag reading frame into fragments, unique restriction sites were generated at positions 5 (NarI), 290 (StyI), 487 (StuI), 696 (SacII),
852 (EcoRV), 1107 (BstEII) and 1307 (BglII) by silent mutations. For cloning, additional restriction sites KpnI (95),
BamHI, XhoI and SacI were introduced within non-coding regions. Synthetic fragments spanning regions between
restriction sites KpnI/StyI (F1), StyI/StuI (F2), StuI/EcoRV (F3), EcoRV/BstEII (F4), BstEII/BglII (F5) and BglII/SacI (F6)
were created by stepwise PCR amplification with overlapping 60 nt long oligonucleotides and subcloned using the
pCR-Script™ Amp SK(+) Cloning Kit (Stratagene, Heidelberg, Germany) following the manufacturer's instructions. The
unique restriction sites KpnI, StyI, StuI, EcoRV, BstEII, BglII and SacI were used to assemble the fragments into a
complete reading frame. Finally, full-length syngag was placed into the KpnI and XhoI restriction sites of the pcDNA3.1
(+) expression vector (Invitrogen, Leek, The Netherlands) under the transcriptional control of the immediate-early promoter-enhancer
of the Cytomegalovirus (CMV) allowing constitutive transcription in mammalian cells.
[0093] In order to eliminate inhibitory sequences within the HIV-1 gag reading frame, a synthetic gene was constructed,
as described above, encoding the entire Pr55gag polyprotein via the codon usage occurring most frequently in highly
expressed mammalian genes. A few deviations from strict adherence to the optimized codon usage were made to accommodate
the introduction of unique restriction sites at approximately 250 to 300 nt intervals (Seq ID No: 5). A comparison
of synthetic and wildtype Gag coding sequences is shown in Fig. 2 A, demonstrating that almost every wobble
position within the wildtype coding region was altered by construction of the synthetic gene. The Wisconsin Genetics
Computer Group (GCG) software package was used to compare the varying composition of the wildtype and synthetic
gag reading frame. Codon optimization resulted in an increased G/C and decreased A/U(T) content, as well as in a
5-fold increase in immune stimulatory CpG motifs (Fig. 2 B).
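
The two composition measures compared here, G/C content and the number of CpG dinucleotides, can be computed with a few lines of code. The sketch below is not the GCG package; the function names and the short test strings are hypothetical, and counting every CG dinucleotide is a simplification, since immune stimulatory CpG motifs additionally depend on the flanking bases.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


def cpg_count(seq: str) -> int:
    """Number of CG dinucleotides (CG cannot overlap itself)."""
    return seq.upper().count("CG")


def fold_change(synthetic: int, wildtype: int) -> float:
    """Ratio used to express an increase such as '5-fold'."""
    if wildtype == 0:
        raise ValueError("wildtype count is zero, fold change undefined")
    return synthetic / wildtype


# Hypothetical toy strings, not taken from the gag reading frames.
print(gc_fraction("ATGC"), cpg_count("ACGCGT"), fold_change(10, 2))
```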
[0094] Several wildtype expression vectors were established by means of standard molecular biology techniques in
order to compare expression of synthetic and wildtype gag reading frames. Subgenomic HIV-1 wildtype sequences
were cloned by PCR amplification or restriction digestion of HX10 provirus DNA (Ratner et al. 1987, AIDS Res. Hum.
Retroviruses 3:57-69). The wildtype gag sequence (nucleotides 9-1640) including a 5'-located 103 bp long UTR
was cloned into the KpnI and XhoI restriction sites of pcDNA3.1 (Stratagene) expression vector after PCR amplification
of a 1667 nt fragment with primers utr-1 (5'-gcg GGT ACC GAA TTC cga cgc agg act cgg ctt gc-3') and gag-2 (5'-gcc
GAG CTC CTC GAG GGA TCC tta ttg tga cga ggg gtc gtt gcc aaa gag-3') resulting in pCMV-UTR-wtgag. Wildtype
gag was cloned into the KpnI and XhoI restriction sites of pcDNA3.1 (Stratagene) expression vector by PCR amplification
of a 1537 nt fragment (103-1640) with primers gag-1 (5'-gcg GGT ACC GAA TTC agg aga gag atg ggt gcg aga
gcg tca gta tta agc-3') and gag-2 resulting in pCMV-wtgag. PCR amplification of the UTR with primers utr-1 and utr-2
(5'-gga tG GCG CCc atc tct ctc ctt cta gcc tcc-3') resulted in a 103 nt fragment (9-112) which was cloned into the KpnI
and NarI restriction sites of pCMV-syngag, thereby placing the UTR directly 5' of the start codon of syngag resulting
in pCMV-UTR-syngag. BglII and BamHI digestion of HX10 proviral DNA released an 854 nt fragment (6976-7830) containing
the RRE and an inefficiently used splice acceptor site (pos. 7734). This RRE containing fragment was cloned
into the BamHI restriction site of the plasmids pCMV-UTR-wtgag, pCMV-wtgag, and pCMV-UTR-syngag resulting in
pCMV-UTR-wtgag-RRE, pCMV-wtgag-RRE and pCMV-UTR-syngag-RRE, respectively. A DNA fragment coding for
the constitutive transport element (CTE) RNA element of Simian Mason-Pfizer D-type retrovirus (MPMV, Accession
Nr. M12349, nucleotides 7886 to 8386) was amplified using MPMV proviral DNA as template and primers cte-1 (5'
gct aGG ATC Ccc att atc atc gcc tgg aac 3') and cte-2 (5' cga aCT CGA Gca aac aga ggc caa gac atc 3'). The 500 bp
fragment was cloned into the BamHI and XhoI restriction sites of pCMV-wtgag and pCMV-UTR-wtgag resulting in constructs
pCMV-wtgag-CTE and pCMV-UTR-wtgag-CTE, respectively. A schematic representation of all constructs is
summarised in Fig. 1 A.

Example 2: 

[0095] Transfection of Gag encoding expression constructs into mammalian cells. Cells were transfected by the
calcium coprecipitation technique (Graham and Van der Eb 1973, Virol. 52, 456-467). The day prior to transfection
2 x 10^6 cells were plated on 100 mm-diameter culture dishes and incubated for 24 hours. For transfection, 30 µg of
Pr55gag expressing plasmids was used and the total amount of DNA adjusted to 45 µg with either pcDNA3.1(+) vector
or, for co-transfection experiments, with pCMV-rev. Cells were harvested 48 hours post transfection, washed two times
in PBS and then further analyzed. Western blot analysis of cells transfected with Gag encoding expression constructs.
Harvested cells were lysed in 0.5% Triton X-100, 100 mM Tris/HCl (pH 7.4), subjected to repeated freeze/thaw cycles,
cleared by centrifugation (20800 x g for 5 min.) and the total amount of protein measured by BIO-RAD Protein-Assay
(Bio-Rad Laboratories, Munich, Germany) following the manufacturer's instructions. 50 µg total protein were separated
by electrophoresis on a denaturing SDS 12.5% polyacrylamide gel and transferred onto nitrocellulose membrane by
electroblotting. Expression of Pr55gag was detected by the HIV-1 p24 specific monoclonal antibody 13-5 (Wolf et al. 1990,
AIFO 1:24-29) and visualized by chromogenic staining.

[0096] Gag capture-ELISA of cells transfected with Gag encoding expression constructs. The p24 capture-ELISA
antibodies were kindly provided by Dr. Matthias Niedrig (RKI, Berlin, Germany). 96-well MaxiSorp ELISA plates (Nunc,
Wiesbaden, Germany) were coated overnight with 100 µl of p24-specific monoclonal antibody 11-G-7 diluted 1:170
with 0.1 M carbonate buffer (pH 9.5). The plates were washed 6 times with wash-buffer (0.1% Tween 20, 300 mM NaCl,
10 mM Na2HPO4 · 2H2O, 1.5 mM NaH2PO4 · H2O). Different amounts of cell lysates were diluted in 0.5% BSA in wash-buffer,
added to the wells and incubated overnight at 4°C with a second horseradish peroxidase-conjugated monoclonal
antibody (diluted 1:600 in 0.5% BSA in wash-buffer), recognizing a different epitope within p24. After washing
the plates 6 times with wash-buffer, antibody conjugates were stained with OPD-solution (Abbott, Wiesbaden, Germany)
and absorption OD (495 nm) measured with an ELISA Reader. The concentration of Pr55gag was determined by
a calibration curve using different concentrations of purified Pr55gag, produced in insect cells using the baculovirus
expression system (Wagner et al. 1994, Virology 200:162-175).
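
Reading a sample concentration off a calibration curve can be sketched as below. The text does not state how the curve was fitted, so piecewise-linear interpolation between neighbouring standards is an assumption made for this sketch; the function name and all standard values are hypothetical and are not measured data.

```python
def interpolate_concentration(od: float, standards) -> float:
    """Interpolate a concentration from (concentration, OD) standard pairs.

    Uses straight lines between neighbouring standards and refuses to
    extrapolate beyond the lowest or highest standard.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 == a1:
            raise ValueError("two standards share the same OD value")
        if a0 <= od <= a1:
            return c0 + (od - a0) * (c1 - c0) / (a1 - a0)
    raise ValueError("OD outside the calibrated range")


# Hypothetical standards: (Pr55gag concentration in ng/ml, OD at 495 nm).
standards = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
print(interpolate_concentration(0.35, standards))
```

Dividing the interpolated amount by the total protein loaded per well would then give values in ng per µg total protein, the unit used for the results reported below.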

[0097] As expected, synthesis of Pr55gag from the wildtype gag gene (wtgag) was extremely low in absence of RRE
and Rev (Fig. 3 A, lanes 5, 6). However, following fusion of RRE downstream of the wtgag gene and co-transfection of Rev,
Pr55gag production increased only by a factor of 1.5-2, suggesting that the Rev/RRE system per se was not sufficient to
promote substantial Gag expression (Fig. 3 A, lanes 7, 8). In contrast, addition of the authentic 103 bp UTR 5' to the
gag gene resulted in a 4-5 fold increase in protein production even in the absence of Rev (Fig. 3 A, lanes 1, 3 and 4).
Additionally, Rev/RRE interaction led to a further increase in Pr55gag production by a factor of 5-8 (2-4 ng/µg total
protein; Fig. 3 A, lane 2) and by several orders of magnitude compared to wtgag-RRE co-transfected with Rev. Substitution
of Rev/RRE by CTE-mediated nuclear export depended in the very same way on the presence of the authentic
UTR, resulting in high level Gag expression. However, CTE-mediated Pr55gag production (Fig. 3 B, lane 3) reached
only 70% - 80% (1.3 - 1.6 ng/µg total protein) of the Rev-dependent Gag expression using Rev/RRE (Fig. 3 B, lane
1), whereas in absence of the UTR no or only very little Gag was produced (Fig. 3 B, lanes 4, 5).

[0098] High level expression of Pr55gag was achieved in all cases after transient transfection of the syngag encoding
plasmids (Fig. 3 C, lanes 1-4). Expression levels were not substantially altered by introducing the Rev/RRE system and
were neither dependent on nor influenced by the presence of the UTR (Fig. 3 C, lanes 3-4). Pr55gag expression levels exceeded
those accomplished by CTE-mediated wtgag expression by more than 130%, those accomplished by co-transfection
of UTR-wtgag-RRE with Rev by about 50% to 100% (Fig. 3 C, lane 6), and those accomplished by wtgag-RRE
by orders of magnitude, whether or not Rev was co-transfected (Fig. 3 C, lanes 7, 8). Therefore, codon usage adaptation allows
Rev-independent expression in absence of any cis-acting regulatory elements, resulting in high yields of Pr55gag production
(3.5-6.5 ng/µg total protein).

[0099] These results were confirmed using various cell lines (Cos7, HeLa), ruling out that cell-type specific factors
critically contributed to the observed effects (Fig. 3 D).

Example 3: 

[0100] Immunogenicity of Gag expression vectors after naked DNA vaccination. Female BALB/c mice (Charles River,
Sulzfeld, Germany) were housed under specific pathogen-free conditions and injected at the age of 6 to 12 weeks. In
order to evaluate the efficiency and type of immune response induced by either wildtype or synthetic reading frames
after DNA vaccination, groups of 3 mice each received a subcutaneous (s.c.) primary injection, at the base of the tail,
of a solution containing 100 µg plasmid DNA in a total volume of 100 µl PBS. At week 3 and week 6 after the primary injection,
mice received a boost injection.
[0101] Serum was recovered from mice by tail bleed at week two after the boost injection. Antibodies specific for the
HIV Pr55gag polyprotein were quantified by an end-point dilution ELISA (in duplicate) on samples from individual
animals. In brief, a solid phase of PrepCell-purified Pr55gag protein (100 µl of 1 µg/ml per well, overnight at 4°C in a
refrigerator) was used to capture Pr55gag-reactive antibodies from the serum (2 h at 37°C), which were then detected
with horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG1, IgG2a and total IgG antibodies (1:2000 in PBS,
2% Tween 20, 3% FCS; 100 µl/well; PharMingen, Hamburg, Germany), followed by o-phenylenediamine dihydrochloride
solution (100 µl/well, 20 min at room temperature in the dark; Abbott Laboratories, Abbott Park, IL). End-point
titers were defined as the highest serum dilution that resulted in an absorbance value (OD 492) three times greater
than that of the same dilution of a nonimmune serum. The serum of each mouse was assayed and these values were
used to calculate the mean and standard deviation for each group of five mice. Single cell suspensions were aseptically
prepared from spleens of mice 5 days after the boost immunization. Cells were suspended in α-MEM medium (Gibco)
supplemented with 10 mM HEPES buffer, 5 x 10^-5 M β-mercaptoethanol and 10% FCS. 5% of a selected batch of Con
A-stimulated rat spleen cell supernatants (Poggi et al. 1994, Eur J Immunol. 24(9):2258-61) was further
added to the culture medium as a source of growth factors. Responder cells (3 x 10^7) were cocultured with 1.5 x 10^6
syngeneic, V3 LAI peptide-pulsed P815 cells (irradiated with 20,000 rad) in 10 ml tissue culture medium in upright
25 cm² tissue culture flasks in a humidified atmosphere with 7% CO2 at 37°C. Cytotoxic effector populations were
harvested after 6 days of in vitro culture. Serial dilutions of effector cells were cultured with 2 x 10^3 target cells in
200 µl round-bottom wells. Targets were autologous A20 cells (2 x 10^[exponent illegible]/ml) incubated overnight at
37°C with 10^-8 M of an 18-mer V3 LAI peptide or a 23-mer p24CA LAI peptide. Non-peptide-pulsed cells were used as a
negative control. Target and control A20 cells were labeled with 51Cr (1 hour at 37°C, 20 µCi/10^6 cells) prior to being
added to the effector cells. After a 4 hour incubation at 37°C, 100 µl of supernatant were collected for γ-counting.
The percentage specific release was calculated as
[(experimental release - spontaneous release) / (total release - spontaneous release)] x 100. Total
counts were measured after adding 1% Triton X-100 to the labeled target cells. Spontaneously released counts were
always less than 20% of the total counts. Data shown are the mean of triplicate cultures. Standard errors of the mean
of triplicate data were always less than 20% of the mean.
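The two read-outs defined above (end-point titer and percentage specific release) can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout (reciprocal dilution mapped to OD 492) and all numbers are hypothetical and are not taken from the experiments described here.

```python
def percent_specific_release(experimental, spontaneous, total):
    """[(experimental - spontaneous) / (total - spontaneous)] x 100, as in the text."""
    if total <= spontaneous:
        raise ValueError("total release must exceed spontaneous release")
    return (experimental - spontaneous) / (total - spontaneous) * 100.0


def end_point_titer(od_immune, od_nonimmune, factor=3.0):
    """Highest reciprocal dilution whose OD exceeds `factor` times the
    OD of non-immune serum at the same dilution (None if no dilution qualifies).

    Both arguments map reciprocal dilution -> OD 492 (assumed layout).
    """
    positive = [d for d, od in od_immune.items() if od > factor * od_nonimmune[d]]
    return max(positive) if positive else None


if __name__ == "__main__":
    # Hypothetical counts per minute and OD values, for illustration only.
    print(percent_specific_release(experimental=1500, spontaneous=500, total=4500))
    print(end_point_titer({100: 1.2, 400: 0.6, 1600: 0.2},
                          {100: 0.1, 400: 0.1, 1600: 0.1}))
```

The sketch takes the highest qualifying dilution without checking that all lower dilutions are also positive; a stricter reading of "end point" might add that check.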

[0102] As depicted in Fig. 4, synthetic genes derived from lentiviral pathogens are capable of inducing a strong
humoral and cellular immune response. High levels of specific antibodies could be detected only in mice immunized
with synthetic gag plasmids. A Th-1 response is more likely to be accompanied by a cellular immune response, which
was verified by a CTL chromium release assay of freshly prepared spleen cells of the immunized mice. Again, only
immunization with a Gag expression plasmid with optimized codon usage induced strong cytolytic
activity and therefore a cell-mediated immune response.

Example 4: 

[0103] Construction of synthetic genes coding for packaging functions of a lentiviral gene transfer vector system. All
subsequent numbering of nucleotide sequences referring to synthetic GagPol sequences of HIV-1 or SIV is relative
to the start codon of the respective coding region. Positions of restriction sites are defined by their cleavage sites.
[0104] Design of a codon optimized HIV-1 GagPol gene encoding packaging functions. A synthetic sequence coding
for the HIV-1 IIIB (BH10, GenBank Accession: M15654) Pr160GagPol polyprotein was generated by backtranslating the
amino acid sequence of the GagPol coding region (nucleotides 1 - 4343), except for the region where the gag and pol
reading frames overlap (nucleotides 1298 - 1539), using the codons occurring most frequently in
mammalian cells (as depicted in Seq ID No:2). In order to divide this synthetic GagPol reading frame (Hgpsyn) into
fragments, restriction sites were introduced by silent mutations at positions 290 (StyI), 487 (StuI), 1056 (PflMI),
1107 (BstEII), 1549 (BalI), 1852 (SauI), 2175 (AccI), 2479 (BstXI), 2682 (EcoNI), 2867 (SphI), 3191 (PflMI),
3370 (BclI), 3588 (BalI), 3791 (XmaIII), and 4105 (StyI). For cloning, additional restriction sites KpnI, XhoI and SacI
were introduced within non-coding regions. Synthetic fragments spanning the regions between restriction sites
KpnI/StyI (F1), StyI/StuI (F2), StuI/BstEII (F3), BstEII/BalI (F4), BalI/SauI (F5), SauI/AccI (F6), AccI/BstXI (F7),
BstXI/EcoNI (F8), EcoNI/SphI (F9), SphI/PflMI (F10), PflMI/BclI (F11), BclI/BalI (F12), BalI/XmaIII (F13),
XmaIII/StyI (F15) and StyI/SacI (F16) were created by stepwise PCR amplification of overlapping 60 nt long
oligonucleotides and subcloned using the pCR-Script™ Amp SK(+) Cloning Kit (Stratagene, Heidelberg, Germany)
following the manufacturer's instructions. The unique restriction sites KpnI, StyI, StuI, BstEII, and SacI were used to
assemble the fragments F1 to F4 into one intermediate fragment (F1-4). The unique restriction sites KpnI, BalI, SauI
and SacI were used to assemble the fragments F5 to F7 into a second intermediate fragment (F5-7). The unique
restriction sites KpnI, BstXI, EcoNI, SphI, PflMI, BclI, BalI, XmaIII, StyI and SacI were used to assemble the fragments
F8 to F16 into a third intermediate fragment (F8-16). The intermediate fragments (F1-4, F5-7 and F8-16) were finally
assembled into the full-length synthetic GagPol reading frame using restriction sites KpnI/BstEII, BstEII/AccI, and
AccI/SacI, respectively, and placed into the KpnI and SacI restriction sites of the pCR-Script™ Amp SK(+) cloning
vector (Stratagene, Heidelberg, Germany). In order to obtain high-level, constitutive expression in mammalian cells,
the synthetic GagPol coding region was placed into the KpnI and XhoI restriction sites of the pcDNA3.1(+) expression
vector (Invitrogen, Leek, The Netherlands) under the transcriptional control of the immediate-early
promoter-enhancer of the Cytomegalovirus (CMV), resulting in plasmid Hgpsyn (encoding HIV-1 GagPol).

Design of a codon optimized SIV GagPol gene encoding packaging functions. A synthetic sequence coding for the
SIVmac239 (SIVmm239, GenBank Accession: M33262) Pr160GagPol polyprotein was generated by backtranslating the
amino acid sequence of the GagPol coding region (nucleotides 1 - 4358), except for the region where the gag and pol
reading frames overlap (nucleotides 1299 - 1533), using the codons occurring most frequently in mammalian cells.
In order to divide this synthetic SIV-derived GagPol reading frame into fragments, restriction sites were introduced by
silent mutations at positions 120 (BstXI), 387 (XmaIII), 650 (PstI), 888 (NdeI), 1224 (ApaI), 1422 (NspI), 1583 (EheI),
1899 (SauI), 2019 (BsaBI), 2372 (EcoNI), 2651 (SacII), 2785 (BclI), 3025 (BamHI), 3268 (StyI), 3573 (BalI),
3745 (SauI), 3923 (PflMI), and 4188 (ApaI). For cloning, additional restriction sites KpnI, XhoI and SacI were
introduced within non-coding regions. Synthetic fragments spanning the regions between restriction sites
KpnI/BstXI (F1), BstXI/XmaIII (F2), XmaIII/PstI (F3), PstI/NdeI (F4), NdeI/ApaI (F5), ApaI/NspI (F6), NspI/EheI (F7),
EheI/SauI (F8), SauI/BsaBI (F9), BsaBI/EcoNI (F10), EcoNI/SacII (F11), SacII/BclI (F12), BclI/BamHI (F13),
BamHI/StyI (F14), StyI/BalI (F15), BalI/SauI (F16), SauI/PflMI (F17), PflMI/ApaI (F18) and ApaI/SacI (F19) were
created by stepwise PCR amplification of overlapping 60 nt long oligonucleotides and subcloned using the
pCR-Script™ Amp SK(+) Cloning Kit (Stratagene, Heidelberg, Germany) following the manufacturer's instructions. The
unique restriction sites KpnI, BstXI, XmaIII, PstI, NdeI, ApaI and SacI were used to assemble the fragments F1 to F6
into one intermediate fragment (F1-6). The fragments F7 to F11 were assembled into a second intermediate fragment
(F7-11) using the unique restriction sites KpnI, EheI, SauI, BsaBI, EcoNI and SacI. The unique restriction sites KpnI,
BclI, BamHI, StyI, BalI, SauI, PflMI, ApaI and SacI were used to assemble the fragments F12 to F19 into a third
intermediate fragment (F12-19). The intermediate fragments F1-6, F7-11 and F12-19 were finally assembled into the
full-length synthetic GagPol gene of SIV using restriction sites KpnI/NspI, NspI/SacII, and SacII/SacI, respectively,
and placed into the KpnI and SacI restriction sites of the pCR-Script™ Amp SK(+) cloning vector (Stratagene,
Heidelberg, Germany). In order to obtain high-level, constitutive expression in mammalian cells, the SIV-derived
synthetic GagPol gene was placed into the KpnI and XhoI restriction sites of the pcDNA3.1(+) expression vector
(Invitrogen, Leek, The Netherlands) under the transcriptional control of the immediate-early promoter-enhancer of
the Cytomegalovirus (CMV), resulting in plasmid Sgpsyn (encoding SIVmac239 GagPol).
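The backtranslation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the codon table (one preferred codon per amino acid) is an assumed choice, the function names are hypothetical, and the sketch treats the protein as a single reading frame, whereas the real GagPol gene contains a ribosomal frameshift in the overlap region that was left as wild-type sequence.

```python
# Assumed table: one preferred codon per amino acid, chosen here only to
# illustrate "codons occurring most frequently in mammalian cells".
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "AGG",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",
}


def backtranslate(protein):
    """Replace every amino acid by its single preferred codon."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def backtranslate_except(protein, wildtype_dna, keep_start, keep_end):
    """Backtranslate, but keep the wild-type nucleotides keep_start..keep_end
    (1-based, inclusive), e.g. a region that must not be altered."""
    synthetic = backtranslate(protein)
    if len(synthetic) != len(wildtype_dna):
        raise ValueError("protein and wild-type DNA lengths do not match")
    kept = wildtype_dna[keep_start - 1:keep_end].upper()
    return synthetic[:keep_start - 1] + kept + synthetic[keep_end:]


if __name__ == "__main__":
    print(backtranslate("MGVRNSVLSGKKADELEKIR").lower())
```

As a plausibility check, the 20-residue peptide used in the example is the standard-code translation of the first 60 nucleotides of Seq ID No:1, and with the assumed table the function returns exactly those 60 nucleotides.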

[0105] Codon usage of the 4.3 kb long GagPol genes of HIV-1 and SIV was optimized for expression in human cells
by assembling the entire reading frame from synthetic oligonucleotides. The synthetic genes encode GagPol proteins
which have the same amino acid sequence as wildtype SIV or HIV-1 GagPol. On the nucleotide level there is an overall
identity of 69.27% for the synthetic and wildtype GagPol of HIV-1 and of 73.24% for the synthetic and wildtype SIV
GagPol genes (Table 1).

Table 1:

Genetic distances between the different SIV and HIV-1 GagPol coding sequences

GagPol gene    SIVsyn    SIVwt    HIVsyn    HIVwt
SIVsyn           0.00    29.81     36.16    72.21
SIVwt                     0.00     69.78    54.72
HIVsyn                              0.00    30.73
HIVwt                                        0.00


[0106] Distance matrix of SIVmac239 and HIV IIIB derived synthetic (syn) and wildtype (wt) GagPol coding sequences.
Numbers indicate the percentage of mismatches between the compared GagPol sequences.
[0107] Optimizing the codon usage of HIV-1 GagPol reduces the homology to wildtype SIV GagPol by 41.48% and
vice versa (Table 1). The longest stretch of nucleotide identity between synthetic and wildtype GagPol is 357 bp in SIV
and 306 bp in HIV-1. This stretch is located in the p6gag/p6pol overlap region containing the slippery sequence and a
stem loop structure, both of which are required for proper ribosomal frameshifting.
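The two quantities reported here, the percentage of mismatches and the longest stretch of nucleotide identity, can be computed as sketched below. The sketch assumes two sequences of equal length that are compared position by position without gaps (which holds for a synthetic gene and the wild-type gene encoding the same protein); comparing SIV with HIV-1 would first require an alignment, which is not shown. Function names and the toy sequences are hypothetical.

```python
def percent_mismatch(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * mismatches / len(seq_a)


def longest_identical_stretch(seq_a, seq_b):
    """Length of the longest run of consecutive identical positions."""
    best = run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(percent_mismatch("ATGA", "ATGC"))
    print(longest_identical_stretch("ATGGCCAAG", "ATGGCGAAG"))
```

Percentage identity is then simply 100 minus the mismatch percentage.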

Example 5:

[0108] Expression of synthetic GagPol genes. Rev-independent expression of the synthetic HIV-1 and SIV GagPol
genes was assayed in transient transfection experiments followed by characterisation of the GagPol particles by
conventional immunoblot analysis. Briefly, the day prior to transfection 2 x 10^6 H1299 cells were plated on
100 mm-diameter culture dishes and incubated for 24 hours. Cells were transfected using 45 µg of the synthetic
GagPol expressing plasmids or proviral DNA (HX10 and SIVmac239) with the calcium phosphate coprecipitation method
as described (Graham and Van der Eb 1973, Virol. 52, 456-467). Cell culture supernatants were harvested 72 hours
post transfection. To block processing of GagPol, plated cells were cultured and transfected in the presence of
10 µM Saquinavir. GagPol particles were purified by separating the culture supernatants on 10-60% sucrose gradients
as previously described (Wagner et al. 1994, Virology 200:162-175). GagPol particles banded at the expected density
of 1.13 to 1.19 g/cm³, typical for HIV virions. The fractions with the highest particle content were dialyzed against
PBS and further separated by electrophoresis on a denaturing SDS 12.5% polyacrylamide gel prior to transfer onto a
nitrocellulose membrane by electroblotting. HIV-1 p24 capsid (CA) and SIV p27 CA and their precursor proteins were
detected by the monoclonal antibodies 13-5 (Wolf et al. 1990, AIFO 1:24-29) and 55-2F12 (NIH, Rockville, USA),
respectively.

[0109] Western blot analyses of supernatants of cells transfected with expression plasmids for the synthetic HIV-1
and SIV GagPol genes (Hgpsyn, Sgpsyn in Fig. 5 B) revealed efficient expression of GagPol in the absence of Rev
(Fig. 5 A, B). The GagPol levels obtained with the synthetic genes were significantly higher than the levels obtained by
transfection of infectious proviral constructs in the presence of Rev, as assessed by a commercial capture ELISA format.
Subviral particles were readily released from cells transfected with the synthetic GagPol expression plasmids and
sedimented at a density of about 1.16 g/ml, comparable to wildtype SIV or HIV-1 virions. No differences were observed
in the processing of the GagPol precursor proteins encoded by the wildtype viruses and the corresponding synthetic
GagPol genes (Fig. 5 A, B).

Example 6: 

[0110] Vector production and infection of target cells. The 293T cells were transfected with the calcium phosphate
coprecipitation method as described (Graham and Van der Eb 1973, Virol. 52, 456-467). The plasmids pHIT60 (MLV
GagPol expression plasmid), pHIT-G (VSV-G expression plasmid), SgpA2 (SIV GagPol expression plasmid also
expressing vif, vpr, vpx, tat and rev), and ViGABH (self-inactivating SIV vector expressing the green fluorescent protein
gene (GFP) under the control of the promoter and enhancer from spleen focus forming virus) have been described
previously (Graham and Van der Eb 1973, Virol. 52, 456-467; Fouchier et al. 1997, EMBO 16, 4531-4539). The murine
leukemia virus (MLV) vector pLEGFP-N1, which contains the GFP gene under the control of an internal CMV promoter,
was obtained from Clontech. The supernatant of the transfected cells was passed through a 0.45 µm filter and stored
in aliquots at -80°C. The vector stocks were assayed for CA antigen levels using an HIV p24 enzyme-linked
immunosorbent assay (ELISA) kit (Innogenetics, Gent, Belgium). The standard included in the ELISA kit was used for
quantification of HIV-1 p24 CA antigen levels. Recombinant SIV p28 CA antigen from the SIV p28 antigen capture assay
from the AIDS vaccine program (NCI-Frederick Cancer Research and Development Center) served as a standard for the
quantification of SIV p28 levels.
[0111] To determine vector titers, 293 cells were seeded at 2.5 x 10^4 cells per well of a 24-well plate. One day later,
the medium was removed and cells were incubated for 2 to 4 hours with serial dilutions of the vector preparations in
a total volume of 200 µl. Fresh medium was added and the number of GFP-positive cells was determined two days
after infection. To arrest 293 cells in the G1/S phase of the cell cycle, cells were seeded in a 96-well plate at a density
of 2 to 5 x 10^4 cells per well at a concentration of 1, 3, or 5 µg aphidicolin/ml (Sigma). The titer of the vector stocks
was determined on these cells as described above, with aphidicolin being present during the entire culture period.
[0112] The functional integrity of the GagPol proteins encoded by the synthetic genes was analysed by cotransfection
of the synthetic GagPol expression plasmids and an expression plasmid for VSV-G (pHIT-G) with the SIV vector
ViGABH (Fig. 6). This is a self-inactivating vector, which expresses the green fluorescent protein gene (GFP) under
the control of an internal promoter. The vector titers in the supernatant of the transfected cells were in the range of
1 x 10^6 GFP-forming units/ml supernatant (Table 2). Similar titers were obtained after cotransfection of ViGABH with a
wildtype GagPol expression plasmid of SIV (SgpA2). The synthetic HIV-1 GagPol expression plasmid also allowed
efficient transfer of the SIV vector (Table 2). The infectivity of the vector particles varied less than 2-fold between the
different GagPol expression plasmids used (Table 2). Therefore, HIV-1 GagPol must recognise all cis-acting sequences
of SIV required for packaging, reverse transcription, and integration with an efficiency similar to that of SIV GagPol.
These results clearly extend a previous report on the packaging of HIV-2 RNA, which is closely related to SIVmac,
in HIV-1 particles (Kaye and Lever, 1998, J. Virol. 72, 5877-5885).



Table 2.

Titer and infectivity of an SIV vector packaged by synthetic SIV and HIV-1 GagPol expression plasmids

                               Titer (a)
Plasmids transfected       Experiment 1    Experiment 2    Infectivity (b)
SgpA2, ViGABH, VSV-G       5.0 x 10^5      1.4 x 10^6      5.9 x 10^5
Sgpsyn, ViGABH, VSV-G      1.0 x 10^5      2.7 x 10^5      3.4 x 10^5
Hgpsyn, ViGABH, VSV-G      2.0 x 10^6      2.0 x 10^6      3.1 x 10^5
ViGABH, VSV-G              <5              <5              n.a.

(a) GFP forming units (GFU) per ml supernatant of 293T cells transfected with the indicated plasmids;
(b) GFU/ng CA antigen in the supernatant of transfected 293T cells; n.a.: not applicable.
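The units used in Table 2 (GFU per ml and GFU per ng CA antigen) suggest a simple calculation, sketched below. The formula for the titer (GFP-positive cells multiplied by the dilution factor and divided by the inoculum volume) is a common convention and an assumption here; the text does not spell it out. Function names and all numbers are hypothetical.

```python
def titer_gfu_per_ml(gfp_positive_cells, dilution_factor, inoculum_ul):
    """GFP-forming units per ml of undiluted vector stock (assumed convention).

    gfp_positive_cells: cells counted in one well
    dilution_factor:    e.g. 1000 for a 1:1000 dilution of the stock
    inoculum_ul:        volume of diluted stock added to the well, in microliters
    """
    return gfp_positive_cells * dilution_factor * 1000 / inoculum_ul


def infectivity_gfu_per_ng(titer_per_ml, ca_ng_per_ml):
    """Infectivity as GFU per ng CA antigen, as in footnote (b) of Table 2."""
    return titer_per_ml / ca_ng_per_ml


if __name__ == "__main__":
    # Hypothetical well: 50 GFP-positive cells at a 1:1000 dilution in 200 microliters.
    titer = titer_gfu_per_ml(50, 1000, 200)
    print(titer)
    # Hypothetical CA antigen concentration of 0.5 ng/ml.
    print(infectivity_gfu_per_ng(titer, 0.5))
```

In practice one would count wells in which GFP-positive cells are well separated and average over replicate dilutions; the sketch handles a single well only.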

[0113] Omitting the GagPol expression plasmid reduced the titer below the level of detection (Table 2), which argues
against VSV-G mediated pseudotransduction. In addition, the percentage of GFP-positive cells transduced with
ViGABH, which had been packaged by Sgpsyn, Hgpsyn, or SgpA2, did not decrease significantly during a six week
observation period (data not shown), which suggests integration of the vector into the host genome. As previously
observed for another SIV vector packaged by SgpA2 and pHIT-G, replication competent recombinants could not be
detected during an 8 week observation period in ViGABH vector preparations packaged by SgpA2, Sgpsyn, or Hgpsyn
and pHIT-G (data not shown). The transduction efficiency of vectors produced with the synthetic GagPol genes for
non-dividing cells was assessed by arresting the target cells in the G1 phase of the cell cycle by aphidicolin treatment.
The titer of an MLV-based vector in growth-arrested cells was reduced to background levels (Fig. 7). In contrast, the
SIV vector titer was only slightly reduced by aphidicolin treatment. Non-dividing cells could be transduced with the
SIV vector independent of the GagPol expression plasmid used for the production of the vector particle (Fig. 7).



Example 7: 

[0114] CEMx174 cells (5 x 10^5) were infected for 4 hours with 0.5 ml of the vector preparation. The cells were washed
three times with PBS and subsequently cultured in 5 ml medium. Infected cultures were split 1:5 to 1:10 twice weekly
for eight weeks. The capsid antigen levels in these cultures were determined 1 and 4 days, and 2, 4, 6 and 8 weeks
after infection using the HIV Innogenetics ELISA and the recombinant SIV p28 antigen as a standard. To determine
the emergence of RCR on CEMx174-SIV-SEAP cells (Means et al., 1997, J. Virol. 71, 7895-7902), 1 x 10^5 cells were
infected with 1 ml supernatant of the transfected cells for two hours and washed once in PBS. Infected cultures were
cultured in 5 ml medium and split 1:5 to 1:10 twice weekly for four weeks. Secreted alkaline phosphatase activity in
the supernatant of these cultures was determined at different time points after infection with the Phospha-Light kit
(Tropix, Bedford, MA, USA) as described by the manufacturer. The titer of RCRs was determined by limiting dilution.
CEMx174-SIV-SEAP cells (2 x 10^5) were incubated with 1 ml supernatant in a final volume of 2 ml. After two hours, six
200 µl aliquots were transferred to a 96-well plate and used as a starting point for 4-fold serial dilutions. Cultures were
split 1:10 after one week. Syncytia formation was scored 1 and 2 weeks after infection and the TCID50 of the RCRs
was calculated as described (Johnson and Byington, 1990, Quantitative assays for virus infectivity. In Techniques in
HIV research. A. Aldovini and B.D. Walker, eds. (New York: Stockton Press), pp. 71-76).
[0115] Detection of replication-competent recombinants. The potential emergence of replication competent
recombinants (RCR) from our vector preparation was analysed in CEMx174 cells, which support replication of SIV and
SIV hybrid viruses utilising a heterologous env gene (Reiprich et al., 1997, J. Virol. 71, 3328-3331). These cells were
infected with 0.5 ml of ViGABH vector preparations (for titers see Example 6, Table 2) packaged by SgpA2, Sgpsyn, or
Hgpsyn and pHIT-G. Slightly elevated capsid antigen levels (<400 pg SIVp27CA/ml) could be detected one and four days
after infection. This has been observed previously and is probably due to carry-over of CA antigen from the vector
preparation (Schnell et al., 2000, Hum. Gene Ther. 11, 439-447). More importantly, CA antigen levels were below the
level of detection of 20 pg SIVp27/ml two to eight weeks after infection in all cultures and no cytopathic effects could
be observed. Therefore, the titer of a potential RCR is below 2 infectious units/ml. Since no RCR could be detected
even with the wild-type SIV gag-pol expression plasmid, SgpA2, this assay did not reveal potential advantages of the
synthetic gag-pol expression plasmids. We therefore tested the frequency of emergence of RCR in an assay system
that would only require two homologous recombination events. SgpA2, Sgpsyn, or Hgpsyn was cotransfected with
either of two SIV vectors that still contained env, vif, vpr, vpx, tat, and rev in addition to the GFP reporter gene, but
which contained large deletions in gag-pol. The CEMx174-SIV-SEAP cells, which release SEAP after infection with SIV
due to Tat transactivation (Means et al., 1997, J. Virol. 71, 7895-7902), were infected with 1 ml of the supernatant of
the transfected cells. Three days after infection, elevated SEAP activity was observed in cultures infected with the SIV
vectors packaged by the synthetic gag-pol expression plasmids (Fig. 8). This should be due to transcomplementation
of gag-pol and subsequent transfer of the SIV vector. However, the SEAP activity returned to background levels during
the next 3 weeks (Fig. 8), indicating the absence of RCR. In contrast, increasing SEAP activity three to ten days after
infection revealed rapid emergence of RCR in SIV vectors packaged by SgpA2. This was accompanied by extensive
syncytia formation and cell death. Using a limiting dilution approach, the titer of the RCR in the supernatant of cells
transfected with SgpA2 and SIV-GFPADP or SgpA2 and SIV-GFPABP was determined to be 130 and 75 TCID50/ml,
respectively. This demonstrates that if RCR emerge at all with the synthetic gag-pol genes, the frequency is reduced
at least by a factor of approximately 100 in comparison to a wild-type gag-pol expression plasmid.
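The two bounds quoted above can be reproduced with simple arithmetic. The sketch below rests on an assumption that the text does not state explicitly: if no RCR is found in a tested volume, the titer is taken to be below one infectious unit per that volume. Function names are hypothetical.

```python
def detection_limit_per_ml(volume_tested_ml):
    """Upper bound on the titer when nothing is detected in the tested volume,
    assuming a single infectious unit would have been detected."""
    return 1.0 / volume_tested_ml


def min_fold_reduction(reference_titer_per_ml, volume_tested_ml):
    """Lower bound on the reduction relative to a reference titer."""
    return reference_titer_per_ml / detection_limit_per_ml(volume_tested_ml)


if __name__ == "__main__":
    # 0.5 ml of vector preparation tested without RCR: bound in units per ml.
    print(detection_limit_per_ml(0.5))
    # Reference titers of 130 and 75 TCID50/ml, 1 ml of supernatant tested.
    print(min_fold_reduction(130, 1.0), min_fold_reduction(75, 1.0))
```

Under this assumption the bound for 0.5 ml is 2 infectious units/ml, and the two reference titers give reductions of at least 130-fold and 75-fold, which is of the same order as the factor of approximately 100 stated in the text.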
SEQUENCE LISTING

<110> Wagner, Ralf

<120> Synthetic gagpol genes and their uses

<130> WAG-002

<140> XX

<141> 2000-05-18

<160> 5

<170> PatentIn Ver. 2.1

<210> 1
<211> 4358
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: sequence with optimized codons

<400> 1
atgggcgtga ggaacagcgt gctgagcggc aagaaggccg acgagctgga gaagatcagg 60
ctgaggccca acggcaagaa gaagtatatg ctgaagcacg tggtgtgggc cgccaacgag 120
ctggacaggt tcggcctggc cgagagcctg ctggagaaca aggagggctg ccagaagatc 180
ctgagcgtgc tggcccccct ggtgcccacc ggcagcgaga acctgaagag cctgtacaac 240
accgtgtgcg tgatctggtg catccacgcc gaggagaagg tgaagcacac cgaggaggcc 300
aagcagatcg tgcagaggca cctggtggtg gagaccggca ccaccgagac catgcccaag 360
accagcaggc ccaccgcccc cagctccggc cgcggcggca actaccccgt gcagcagatc 420
ggcggcaact acgtgcacct gcccctgagc cccaggaccc tgaacgcctg ggtgaagctg 480
atcgaggaga agaagttcgg cgccgaggtg gtgcccggct tccaggccct gagcgagggc 540
tgeacccctt acgacatcaa ccagatgctg aactgcgtgg gcgaccacca ggccgccatg 600
cagatcatca gggacatcat caacgaggag gccgccgact gggacctgca gcaccctcag 660
cccgcccctc agcagggcca gctgagggag cccagcggca gcgacatcgc cggcaccaca 720
agcagcgtgg acgagcagat ccagtggatg tacaggcagc agaaccctat ccccgtgggc 780
aacatctaca ggaggtggat ccagctgggc ctccagaagt gcgtgaggat gtacaacccc 840
acaaacatcc tggacgtgaa gcagggacca aaggagccct tccagtcata tgtggacagg 900
ttctacaaga gcctgagggc cgagcagacc gacgccgccg tgaagaactg gatgacccag 960
accctgctga tccagaacgc caaccccgac tgcaagctgg tgctgaaggg cctgggcgtg 1020
aaccccaccc tggaggagat gctgaccgcc tgccagggcg tgggcggccc cggccagaag 1080
gctaggctga tggccgaggc tctgaaggag gccctggccc ccgtgcccat ccccttcgcc 1140
gccgcccagc agaggggacc caggaagccc atcaagtgct ggaactgcgg caaggagggc 1200
cacagcgcca ggcagtgcag ggcccccagg aggcagggct gctggaagtg cggcaagatg 1260
gaccacgtga tggccaagtg ccccgacagg caggccggtt ttttaggcct tggtccatgg 1320
ggaaagaagc cccgcaattt ccccatggct caagtgcatc a9999ctgat gccaactgct 1380
cccccagagg acccagctgt ggatctgcta aagaactaca tgcagttggg caagcagcag 1440
agagaaaagc agagagaaag cagagagaag ccttacaagg aggtgacaga ggatttgctg 1500
cacctcaatt ctctctttgg aggagaccag tagtgaccgc ccacatcgag ggccagcccg 1560
tggaggtgct gctggacacc ggcgccgacg acagcatcgt gaccggcatc gagctgggac 1620
cccactacac ccccaagatc gtgggcggca tcggcggctt catcaacaca aaggagtaca 1680
agaacgtgga gatcgaggtg ctgggcaaga ggatcaaggg caccatcatg accggcgaca 1740
cccccatcaa catcttcggc aggaacctgc tgaccgccct gggcatgagc ctgaacttcc 1800
ccatcgccaa ggtggagccc gtgaaggtgg ccctgaagcc cggcaaggac ggccccaagc 1860
tgaagcagtg gcctctgagc aaggagaaga tcgtggccct gagggaaatc tgcgagaaga 1920
tggagaagga cggccagctg gaggaggccc ctcccaccaa cccctacaac acccccacct 1980
tcgccatcaa gaagaaggac aagaacaagt ggaggatgct gatcgacttc agggagctga 2040
acagggtgac acaggacttc accgaggtgc agctgggcat ccctcacccc gccggcctgg 2100
ccaagaggaa gaggatcacc gtgctggaca tcggcgacgc ctacttcagc atccctctgg 2160
acgaggagtt caggcagtac accgccttca ccctgcccag cgtgaacaac gccgagcccg 2220
gcaagaggta catctacaag gtgctgcccc agggctggaa gggcagcccc gccatcttcc 2280
agtacaccat gaggcacgtg ctggagccct tcaggaaggc caaccccgac gtgaccctgg 2340
tgcagtacat ggacgacatc ctgatcgcct ccgacaggac cgacctggag cacgacaggg 2400
tggtgctcca gagcaaggag ctgctgaaca gcatcggctt cagcaccccc gaggagaagt 2460
tccagaagga ccctcccttc cagtggatgg gctacgagct gtggcccacc aagtggaagc 2520
tccagaagat cgagctgccc cagagggaga cctggaccgt gaacgacatc cagaagctgg 2580
tgggcgtgct gaactgggcc gcccagattt accccggcat caagaccaag cacctgtgca 2640
ggctgatccg cggcaagatg acactgaccg aggaggtgca gtggaccgag atggccgagg 2700
ccgagtacga ggagaacaag atcattctga gccaggagca ggagggctgc tactaccagg 2760
agggcaagcc cctggaggcc accgtgatca agagccagga caaccagtgg agctacaaga 2820
tccaccagga ggacaagatc ctgaaggtgg gcaagttcgc caagatcaag aacacccaca 2880
ccaacggcgt gaggctgctg gcccacgtga tccagaagat cggcaaggag gccatcgtga 2940
tctggggcca ggtgcccaag ttccacctgc ccgtggagaa ggacgtgtgg gagcagtggt 3000
ggaccgacta ctggcaggtg acatggatcc ccgagtggga cttcatcagc acccctcctc 3060
tggtgaggct ggtgttcaat ctggtgaagg accccatcga gggcgaggag acctactaca 3120
ccgacggcag ctgcaacaag cagagcaagg agggcaaggc cggctacatfc accgacaggg 3180
gcaaggacaa ggtgaaggtg ctggagcaga ccaccaacca gcaggccgag ctggaggcct 3240
tcctgatggc cctgaccgac agcggcccca aggccaacat catcgtggac agccagtatg 3300
tgatgggcat catcaccggc tgccccaccg agagcgagag caggctggtg aaccagatca 3360
tcgaggagat gattaagaag agcgagattt acgtggcctg ggtgcccgcc cacaagggca 3420
tcggcggcaa ccaggagatc gaccacctgg tgagccaggg catcaggcag gtgctgttcc 3480
tggagaagat cgagcccgcc caggaggagc acgacaagta ccacagcaac gtgaaggagc 3540
tggtgttcaa gttcggcctg cccaggatcg tggccaggca gatcgtggac acctgcgaca 3600
agtgccacca gaagggcgag gccatccacg gccaggccaa cagcgacctg ggcacctggc 3660
agatggactg cacccacctg gagggcaaga tcatcatcgt ggccgtgcac gtggctagcg 3720
gcttcatcga ggccgaggtg atccctcagg agaccggcag gcagaccgcc ctgttcctgc 3780
tgaagctggc cggcaggtgg cccatcaccc acctgcacac cgacaacggc gccaacttcg 3840
ccagccagga ggtgaagatg gtggcctggt gggccggcat cgagcacacc ttcggcgtgc 3900
cctacaaccc ccagagccag ggcgtggtgg aggccatgaa ccaccacctg aagaaccaga 3960
tcgacaggat cagggagcag gccaacagcg tggagaccat cgtgctgatg gccgtgcact 4020
gcatgaactt caagaggagg ggcggcatcg gcgacatgac ccccgccgag aggctgatca 4080
acatgattac caccgagcag gagatccagt tccagcagag caagaacagc aagttcaaga 4140
acttcagggt gtattacagg gagggcaggg accagctgtg gaagggcccc ggcgagctgc 4200
tgtggaaggg cgagggcgct gtgatcctga aggtgggcac cgacatcaag gtggtgccca 4260
ggaggaaggc caagatcatc aaggactacg gcggcggcaa ggaggtggac agcagcagcc 4320
acataaaaaa caccoacaaa occaqqaaaa taacctaa 4358


<210> 2
<211> 4343
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: sequence with optimized codons

<400> 2
atgggcgcca gggccagcgt gctgagcggc ggcgagctgg acaggtggga gaagatcagg 60
ctgaggcccg gcggcaagaa gaagtataag ctgaagcaca tcgtgtgggc cagcagggag 120
ctggagaggt tcgccgtgaa ccccggcctg ctggagacca gcgagggctg caggcagatc 180
ctgggccagc tgcagcccag cctgcagacc ggcagcgagg agctgaggag cctgtacaac 240
accgtggcca ccctgtactg cgtgcaccag aggatcgaga tcaaggacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tccaagaaga aggcccagca ggccgccgcc 360
gacaccggcc acagcagcca ggtgagccag aactacccca tcgtgcagaa catccagggc 420
cagatggtgc accaggccat cagccccagg accctgaacg cctgggtgaa ggtggtggag 480
gagaaggcct tcagccccga ggtgatcccc atgttcagcg ccctgagcga gggagccacc 540
ccccaggacc tgaacaccat gctgaacacc gtgggcggcc accaggccgc catgcagatg 600
ctgaaggaga ccatcaacga ggaggccgcc gagtgggaca gggtgcadcc cgtgcacgcc 660
ggccccatcg cccccggcca gatgagggag ccccgcggca gcgacatcgc cggcaccacc 720
agcaccctgc aggagcagat cggctggatg accaacaacc cccccatccc cgtgggcgaa 780
atctacaaga ggtggatcat cctgggcctg aacaagatcg tgaggatgta cagccccacc 840
agcatcctgg atatcaggca gggccccaaa gagcccttca gggactacgt ggacaggttc 900
tacaagaccc tgcgcgccga gcaggccagc caggaggtga agaactggat gaccgagacc 960
ctgctggtgc agaacgccaa ccccgactgc aagaccatcc tgaaggccct gggacccgcc 1020
gccaccctgg aggagatgat gaccgcctgc cagggcgtgg gcggccccgg ccacaaggcc 1080
agggtgctgg ccgaggccat gagccaggtg accaacaccg ccaccatcat gatgcagagg 1140
ggcaacttca ggaaccagag gaagatggtg aagtgcttca actgcggcaa ggagggccac 1200
accgccagga actgccgcgc ccccaggaag aagggctgct ggaagtgcgg caaggagggc 1260 
caccagatga aggactgcac cgagaggcag gctaattttt tagggaagat ctggccttcc 1320 
tacaagggaa ggccagggaa ttttcttcag agcagaccag agccaacagc cccaccattt 1380 
cttcagagca gaccagagcc aacagcccca ccagaagaga gcttcaggtc tggggtagag 1440 
acaacaactc cccctcagaa gcaggagccg atagacaagg aactgtatcc tttaacttcc 1500 
ctcagatcac tctttggcaa cgacccctcg tcacaataaa gatcggtggc cagctgaagg 1560 
aggccctgct ggacaccggc gccgacgaca ccgtgctgga ggagatgagc ctgcccggca 1620 
ggtggaagcc caagatgatc ggcggcatcg gcggcttcat caaggtgagg cagtacgacc 1680 
agatcctgat cgagatctgc ggccacaagg ccatcggcac cgtgctggtg ggccccaccc 1740 
ccgtgaacat catcggcagg aacctgctga cccagatcgg ctgcaccctg aacttcccca 1800 
tcagccccat cgagaccgtg cccgtgaagc tgaagcccgg catggacggc cctaaggtga 1860 
agcagtggcc cctgaccgag gagaagatca aggccctggt ggagatctgc accgagatgg 1920 
agaaggaggg caagatcagc aagatcggcc ccgagaaccc ctacaacacc cccgtgttcg 1980 
ccatcaagaa gaaggacagc accaagtgga ggaagctggt ggacttcagg gagctgaaca 2040 
agaggaccca ggacttctgg gaggtgcagc tgggcatccc ccaccccgcc ggcctgaaga 2100 
agaagaagag cgtgaccgtg ctggacgtgg gcgacgccta cttcagcgtg cccctggacg 2160 
aggacttcag gaagtatacc gccttcacca tccccagcat caacaacgag acccccggca 2220 
tccgctacca gtacaacgtg ctgccccagg gctggaaggg cagccccgcc atcttccaga 2280 
gcagcatgac aaagatcctg gagcccttca agaagcagaa ccccgacatc gtgatctatc 2340 
agtacatgga cgacctgtac gtgggcagcg acctggagat cggccagcac aggaccaaga 2400 
tcgaggagct gaggcagcac ctgctgaggt ggggcctgac cacccccgac aagaagcacc 2460 
agaaggagcc cccattcctg tggatgggct acgagctgca ccccgacaag tggaccgtgc 2520 
agcccatcgt gctgcccgag aaggacagct ggaccgtgaa cgacattcag aagctggtgg 2580 
gcaagctgaa ctgggccagc cagatctacc ccggcatcaa ggtgaggcag ctgtgcaagc 2640 
tgctgagggg cacaaaggct ctgaccgagg tgatccccct gaccgaggag gccgagctgg 2700 
agctggccga gaacagggag atcctgaagg agcccgtgca cggcgtgtac tacgacccca 2760 
gcaaggacct gatcgccgag atccagaagc agggccaggg ccagtggacc taccagatct 2820 
accaggagcc cttcaagaac ctgaagaccg gcaagtacgc ccgcatgcgc ggcgcccaca 2880 
ccaacgacgt gaagcagctg accgaggccg tgcagaagat caccaccgag agcatcgtga 2940 
tctggggcaa gacccccaag ttcaagctgc ccatccagaa ggagacctgg gagacctggt 3000 
ggaccgagta ctggcaggcc acctggattc ccgagtggga gttcgtgaac acccctcccc 3060 
tggtgaagct gtggtatcag ctggagaagg agcccatcgt gggcgccgag accttctacg 3120 
tggacggcgc cgccaacagg gagaccaagc tgggcaaggc cggctacgtg accaacaagg 3180 
gccgccagaa ggtggtgccc ctgaccaaca ccaccaacca gaagaccgag ctgcaggcta 3240 
tctacctggc cctgcaggac tcaggcctgg aggtgaacat cgtgaccgac agccagtacg 3300
ccctgggcat catccaggcc cagcccgaca agagcgagag cgagctggtg aaccagatca 3360 
tcgagcagct gatcaagaag gagaaggtgt acctggcctg ggtgcccgcc cacaagggca 3420 
tcggcggcaa cgagcaggtg gacaagctgg tgagcgccgg catcaggaag atcctgttcc 3480
tggacggcat cgacaaggcc caggacgagc acgagaagta ccacagcaac tggagggcta 3540 
tggctagcga cttcaacctg cctcccgtgg tggctaagga gatcgtggcc agctgcgaca 3600 
agtgccagct gaagggcgag gccatgcacg gccaggtgga ctgcagcccc ggcatctggc 3660 
agctggactg cacccacctg gagggcaagg tgatcctggt ggccgtgcac gtggcctccg 3720 
gctacatcga ggccgaggtg atccccgccg agaccggcca ggagaccgcc tacttcctgc 3780
tgaagctggc cggccgctgg cccgtgaaga ccatccacac cgacaacggc agcaacttca 3840
ccagcgccac cgtgaaggcc gcctgctggt gggccggcat caagcaggag ttcggcatcc 3900
cctacaaccc ccagtctcag ggcgtggtgg agagcatgaa caaggagctg aagaagatca 3960 
tcggccaggt gagggaccag gccgagcacc tgaagaccgc cgtgcagatg gccgtgttca 4020 
tccacaactt caagaggaag ggcggcatcg gcggctacag cgccggcgag aggatcgtgg 4080 
acatcatcgc caccgacatc cagaccaagg agctgcagaa gcagatcacc aagatccaga 4140 
acttcagggt gtactacagg gacagcagga accctctgtg gaagggcccc gccaagctgc 4200 
tgtggaaggg cgagggcgcc gtggtgatcc aggacaacag cgacatcaag gtggtgccca 4260 
ggaggaaggc caagatcatc agggactacg gcaagcagat ggccggcgac gactgcgtgg 4320
cctccaggca ggacgaggac tga 4343 

<210> 3
<211> 4341
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: sequence with optimized codons

<400> 3
atgggcgcca gggccagcgt gctgagcggc ggcgagctgg acaggtggga gaagatcagg 60
ctgaggcccg gcggcaagaa gaagtataag ctgaagcaca tcgtgtgggc cagcagggag 120
ctggagaggt tcgccgtgaa ccccggcctg ctggagacca gcgagggctg caggcagatc 180
ctgggccagc tgcagcccag cctgcagacc ggcagcgagg agctgaggag cctgtacaac 240
accgtggcca ccctgtactg cgtgcaccag aggatcgaga tcaaggacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tccaagaaga aggcccagca ggccgccgcc 360
gacaccggcc acagcagcca ggtgagccag aactacccca tcgtgcagaa catccagggc 420
cagatggtgc accaggccat cagccccagg accctgaacg cctgggtgaa ggtggtggag 480
gagaaggcct tcagccccga ggtgatcccc atgttcagcg ccctgagcga gggagccacc 540
ccccaggacc tgaacaccat gctgaacacc gtgggcggcc accaggccgc catgcagatg 600
ctgaaggaga ccatcaacga ggaggccgcc gagtgggaca gggtgcaccc cgtgcacgcc 660
ggccccatcg cccccggcca gatgagggag ccccgcggca gcgacatcgc cggcaccacc 720
agcaccctgc aggagcagat cggctggatg accaacaacc cccccatccc cgtgggcgaa 780
atctacaaga ggtggatcat cctgggcctg aacaagatcg tgaggatgta cagccccacc 840
agcatcctgg atatcaggca gggccccaaa gagcccttca gggactacgt ggacaggttc 900
tacaagaccc tgcgcgccga gcaggccagc caggaggtga agaactggat gaccgagacc 960
ctgctggtgc agaacgccaa ccccgactgc aagaccatcc tgaaggccct gggacccgcc 1020
gccaccctgg aggagatgat gaccgcctgc cagggcgtgg gcggccccgg ccacaaggcc 1080
agggtgctgg ccgaggccat gagccaggtg accaacaccg ccaccatcat gatgcagagg 1140
ggcaacttca ggaaccagag gaagatggtg aagtgcttca actgcggcaa ggagggccac 1200
accgccagga actgccgcgc ccccaggaag aagggctgct ggaagtgcgg caaggagggc 1260
caccagatga aggactgcac cgagaggcag gctaatttta gggaagatct ggccttccta 1320
caagggaagg ccagggaatt ttcttcagag cagaccagag ccaacagccc caccatttct 1380
tcagagcaga ccagagccaa cagccccacc agaagagagc ttcaggtctg gggtagagac 1440 
aacaactccc cctcagaagc aggagccgat agacaaggaa ctgtatcctt taacttccct 1500 
cagatcactc tttggcaacg acccctcgtc acaataaaga tcggtggcca gctgaaggag 1560 
gccctgctgg acaccggcgc cgacgacacc gtgctggagg agatgagcct gcccggcagg 1620 
tggaagccca agatgatcgg cggcatcggc ggcttcatca aggtgaggca gtacgaccag 1680 
atcctgatcg agatctgcgg ccacaaggcc atcggcaccg tgctggtggg ccccaccccc 1740 
gtgaacatca tcggcaggaa cctgctgacc cagatcggct gcaccctgaa cttccccatc 1800 
agccccatcg agaccgtgcc cgtgaagctg aagcccggca tggacggccc taaggtgaag 1860 
cagtggcccc tgaccgagga gaagatcaag gccctggtgg agatctgcac cgagatggag 1920 
aaggagggca agatcagcaa gatcggcccc gagaacccct acaacacccc cgtgttcgcc 1980 
atcaagaaga aggacagcac caagtggagg aagctggtgg acttcaggga gctgaacaag 2040 
aggacccagg acttctggga ggtgcagctg ggcatccccc accccgccgg cctgaagaag 2100 
aagaagagcg tgaccgtgct ggacgtgggc gacgcctact tcagcgtgcc cctggacgag 2160 
gacttcagga agtataccgc cttcaccatc cccagcatca acaacgagac ccccggcatc 2220 
cgctaccagt acaacgtgct gccccagggc tggaagggca gccccgccat cttccagagc 2280 
agcatgacaa agatcctgga gcccttcaag aagcagaacc ccgacatcgt gatctatcag 2340 
tacatggacg acctgtacgt gggcagcgac ctggagatcg gccagcacag gaccaagatc 2400 
gaggagctga ggcagcacct gctgaggtgg ggcctgacca cccccgacaa gaagcaccag 2460 
aaggagcccc cattcctgtg gatgggctac gagctgcacc ccgacaagtg gaccgtgcag 2520 
cccatcgtgc tgcccgagaa ggacagctgg accgtgaacg acattcagaa gctggtgggc 2580 
aagctgaact gggccagcca gatctacccc ggcatcaagg tgaggcagct gtgcaagctg 2640 
ctgaggggca caaaggctct gaccgaggtg atccccctga ccgaggaggc cgagctggag 2700 
ctggccgaga acagggagat cctgaaggag cccgtgcacg gcgtgtacta cgaccccagc 2760 
aaggacctga tcgccgagat ccagaagcag ggccagggcc agtggaccta ccagatctac 2820 
caggagccct tcaagaacct gaagaccggc aagtacgccc gcatgcgcgg cgcccacacc 2880 
aacgacgtga agcagctgac cgaggccgtg cagaagatca ccaccgagag catcgtgatc 2940 
tggggcaaga cccccaagtt caagctgccc atccagaagg agacctggga gacctggtgg 3000 
accgagtact ggcaggccac ctggattccc gagtgggagt tcgtgaacac ccctcccctg 3060 
gtgaagctgt ggtatcagct ggagaaggag cccatcgtgg gcgccgagac cttctacgtg 3120 
gacggcgccg ccaacaggga gaccaagctg ggcaaggccg gctacgtgac caacaagggc 3180 
cgccagaagg tggtgcccct gaccaacacc accaaccaga agaccgagct gcaggctatc 3240 
tacctggccc tgcaggactc aggcctggag gtgaacatcg tgaccgacag ccagtacgcc 3300 
ctgggcatca tccaggccca gcccgacaag agcgagagcg agctggtgaa ccagatcatc 3360 
gagcagctga tcaagaagga gaaggtgtac ctggcctggg tgcccgccca caagggcatc 3420 
ggcggcaacg agcaggtgga caagctggtg agcgccggca tcaggaagat cctgttcctg 3480 
gacggcatcg acaaggccca ggacgagcac gagaagtacc acagcaactg gagggctatg 3540 
gctagcgact tcaacctgcc tcccgtggtg gctaaggaga tcgtggccag ctgcgacaag 3600 
tgccagctga agggcgaggc catgcacggc caggtggact gcagccccgg catctggcag 3660 
ctggactgca cccacctgga gggcaaggtg atcctggtgg ccgtgcacgt ggcctccggc 3720 
tacatcgagg ccgaggtgat ccccgccgag accggccagg agaccgccta cttcctgctg 3780 
aagctggccg gccgctggcc cgtgaagacc atccacaccg acaacggcag caacttcacc 3840
agcgccaccg tgaaggccgc ctgctggtgg gccggcatca agcaggagtt cggcatcccc 3900
tacaaccccc agtctcaggg cgtggtggag agcatgaaca aggagctgaa gaagatcatc 3960
ggccaggtga gggaccaggc cgagcacctg aagaccgccg tgcagatggc cgtgttcatc 4020
cacaacttca agaggaaggg cggcatcggc ggctacagcg ccggcgagag gatcgtggac 4080
atcatcgcca ccgacatcca gaccaaggag ctgcagaagc agatcaccaa gatccagaac 4140
ttcagggtgt actacaggga cagcaggaac cctctgtgga agggccccgc caagctgctg 4200
tggaagggcg agggcgccgt ggtgatccag gacaacagcg acatcaaggt ggtgcccagg 4260
aggaaggcca agatcatcag ggactacggc aagcagatgg ccggcgacga ctgcgtggcc 4320
tccaggcagg acgaggactg a 4341



<210> 4
<211> 3981
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: sequence with optimized codons

<400> 4
atggccgcca gggccagcgt gctgagcggc ggcgagctgg acaggtggga gaagatcagg 60
ctgaggcccg gcggcaagaa gaagtataag ctgaagcaca tcgtgtgggc cagcagggag 120
ctggagaggt tcgccgtgaa ccccggcctg ctggagacca gcgagggctg caggcagatc 180
ctgggccagc tgcagcccag cctgcagacc ggcagcgagg agctgaggag cctgtacaac 240
accgtggcca ccctgtactg cgtgcaccag aggatcgaga tcaaggacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tccaagaaga aggcccagca ggccgccgcc 360
gacaccggcc acagcagcca ggtgagccag aactacccca tcgtgcagaa catccagggc 420
cagatggtgc accaggccat cagccccagg accctgaacg cctgggtgaa ggtggtggag 480
gagaaggcct tcagccccga ggtgatcccc atgttcagcg ccctgagcga gggagccacc 540
ccccaggacc tgaacaccat gctgaacacc gtgggcggcc accaggccgc catgcagatg 600
ctgaaggaga ccatcaacga ggaggccgcc gagtgggaca gggtgcaccc cgtgcacgcc 660
ggccccatcg cccccggcca gatgagggag ccccgcggca gcgacatcgc cggcaccacc 720
agcaccctgc aggagcagat cggctggatg accaacaacc cccccatccc cgtgggcgaa 780
atctacaaga ggtggatcat cctgggcctg aacaagatcg tgaggatgta cagccccacc 840
agcatcctgg atatcaggca gggccccaaa gagcccttca gggactacgt ggacaggttc 900
tacaagaccc tgcgcgccga gcaggccagc caggaggtga agaactggat gaccgagacc 960
ctgctggtgc agaacgccaa ccccgactgc aagaccatcc tgaaggccct gggacccgcc 1020
gccaccctgg aggagatgat gaccgcctgc cagggcgtgg gcggccccgg ccacaaggcc 1080
agggtgctgg ccgaggccat gagccaggtg accaacaccg ccaccatcat gatgcagagg 1140
ggcaacttca ggaaccagag gaagatggtg aagtgcttca actgcggcaa ggagggccac 1200
accgccagga actgccgcgc ccccaggaag aagggctgct ggaagtgcgg caaggagggc 1260
caccagatga aggactgcac cgagaggcag gctaatttta gggaagatct ggccttccta 1320
caagggaagg ccagggaatt ttcttcagag cagaccagag ccaacagccc caccatttct 1380
tcagagcaga ccagagccaa cagccccacc agaagagagc ttcaggtctg gggtagagac 1440
aacaactccc cctcagaagc aggagccgat agacaaggaa ctgtatcctt taacttccct 1500
cagatcactc tttggcaacg acccctcgtc acaataaaga tcggtggcca gctgaaggag 1560
gccctgctgg ccaccggcgc cgacgacacc gtgctggagg agatgagcct gcccggcagg 1620
tggaagccca agatgatcgg cggcatcggc ggcttcatca aggtgaggca gtacgaccag 1680
atcctgatcg agatctgcgg ccacaaggcc atcggcaccg tgctggtggg acctacacct 1740
gtgaacatca tcggcaggaa cctgctgacc cagatcggct gcaccctgaa cttccccatc 1800
agccccatcg agaccgtgcc cgtgaagctg aagcccggca tggacggccc taaggtgaag 1860
cagtggcccc tgaccgagga gaagatcaag gccctggtgg agatctgcac cgagatggag 1920
aaggagggca agatcagcaa gatcggcccc gagaacccct acaacacccc cgtgttcgcc 1980
atcaagaaga aggacagcac caagtggagg aagctggtgg acttcaggga gctgaacaag 2040
aggacccagg acttctggga ggtgcagctg ggcatccccc accccgccgg cctgaagaag 2100
aagaagagcg tgaccgtgct ggacgtgggc gacgcctact tcagcgtgcc cctggacgag 2160
gacttcagga agtatacccc tttaagacca atgacttaca ** 3 3 N» C*3*- fc»y w agatcttagc 2220
cactttttaa aagaaaaggg gggactggaa gggctaattc actcccaaag aagacaagat 2280
atccttgatc tgtggatcta ccacacacaa ggctacttcc ctgatccaag y au yyy L yy L 2340
aagtggtcaa aaagtagtgt ggttggatgg cctgctgtaa aaoaaaaaat y ay «v»y c*y w w 2400
gagccagcag cagatggggt gggagcagca tctcgagacc taaaaaaaca haoacrrAahr 2460
acaagtagca atacagcagc taccaatgct gcttgtgcct ggctagaagc araaaAaaaa 2520
gaggaggtgg gttttccagt cacacctcaa gtaccattcc rf araaorfn v» uav-y ay ucy 2580
caccccgaca agtggaccgt gcagcccatc gtgctgcccg aaaaaaacaa ctooaccata 2640
aacgacattc agaagctggt gggcaagctg aactgggcca accaoatrta rrrtoorahr 2700
aaggtgaggc agctgtgcaa gctgctgagg ggcacaaagg ctctgaccga oatafl 1" rrfr 2760
ctgaccgagg aggccgagcc ggagctggcc gagaacaggg agatcctgaa ggagcccgtg 2820
cacggcgtgt actacgaccc cagcaaggac ctgatcgccg agatccagaa 3 ua 333 ul * a y 2880
ggccagtgga cctaccagat ctaccaggag cccttcaaga acctgaagac cggcaagtac 2940
gcccgcatgc gcggcgccca caccaacgac gtgaagcagc tqaccqaaac catacaaaaa 3000
atcaccaccg agagcatcgt gatctggggc aagactccta agttcaagct gcccatccag 3060
aaggagacct gggagacctg gtggaccgag tactggcagg ccacctggat tcccgagtgg 3120
gagttcgtga acacccctcc cctggtgaag ctgtggtatc aactaaaaaa 3 ynyv<w\»an« 3180
gtgggcgccg agaccttcta cgtggacggc gccgccaaca aaaaaaccaa actqqacaaa 3240
gccggctacg tgaccaacaa gggccgccag aaggtggtgc ccctgaccaa caccaccaac 3300
cagaagaccg agctgcaggc tatctacctg gccctgcagg actcaggcct aaaoqtqaac 3360
atcgtgaccg acagccagta cgccctgggc atcatccagg cccagcccga caaaaacaaa 3420
agcgagctgg tgaaccagat catcgagcag ctgatcaaga aaaaaaaaat gtacctggcc 3480
tgggtgcccg cccacaaggg catcggcggc aacgagcagg tggacaagct ggtgagcgcc 3540
ggcatcagga agatcctgtt cctggacggc atcgacaagg cccaggacga gcacgagaag 3600
taccacagca actggagggc tatggctagc gacttcaacc tgcctcccgt ggtggctaag 3660
gagatcgtgg ccagcgcctt caccatcccc agcatcaaca acgagacccc cggcatccgc 3720
taccagtaca acgtgctgcc ccagggctgg aagggcagcc ccgccatctt ccagagcagc 3780
atgacaaaga tcctggagcc cttcaagaag cagaaccccg acatcgtgat ctatcagtac 3840
atggacgacc tgtacgtggg cagcgacctg gagatcggcc agcacaggac caagatcgag 3900
gagctgaggc agcacctgct gaggtggggc ctgaccaccc ccgacaagaa gcaccagaag 3960
gagcccccat tcctgtggta a 3981

<210> 5
<211> 1539
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: sequence with optimized codons

<400> 5
atgggcgcca gggccagcgt gctgagcggc ggcgagctgg acaggtggga gaagatcagg 60
ctgaggcccg gcggcaagaa gaagtataag ctgaagcaca tcgtgtgggc cagcagggag 120
ctggagaggt tcgccgtgaa ccccggcctg ctggagacca gcgagggctg caggcagatc 180
ctgggccagc tgcagcccag cctgcagacc ggcagcgagg agctgaggag cctgtacaac 240
accgtggcca ccctgtactg cgtgcaccag aggatcgaga tcaaggacac caaggaggcc 300
ctggacaaga tcgaggagga gcagaacaag tccaagaaga aggcccagca ggccgccgcc 360
gacaccggcc acagcagcca ggtgagccag aactacccca tcgtgcagaa catccagggc 420
cagatggtgc accaggccat cagccccagg accctgaacg cctgggtgaa ggtggtggag 480
gagaaggcct tcagccccga ggtgatcccc atgttcagcg ccctgagcga gggagccacc 540
ccccaggacc tgaacaccat gctgaacacc gtgggcggcc accaggccgc catgcagatg 600
ctgaaggaga ccatcaacga ggaggccgcc gagtgggaca gggtgcaccc cgtgcacgcc 660
ggccccatcg cccccggcca gatgagggag ccccgcggca gcgacatcgc cggcaccacc 720
agcaccctgc aggagcagat cggctggatg accaacaacc cccccatccc cgtgggcgaa 780
atctacaaga ggtggatcat cctgggcctg aacaagatcg tgaggatgta cagccccacc 840
agcatcctgg atatcaggca gggccccaaa gagcccttca gggactacgt ggacaggttc 900
tacaagaccc tgcgcgccga gcaggccagc caggaggtga agaactggat gaccgagacc 960
ctgctggtgc agaacgccaa ccccgactgc aagaccatcc tgaaggccct gggacccgcc 1020
gccaccctgg aggagatgat gaccgcctgc cagggcgtgg gcggccccgg ccacaaggcc 1080
agggtgctgg ccgaggccat gagccaggtg accaacaccg ccaccatcat gatgcagagg 1140
ggcaacttca ggaaccagag gaagatggtg aagtgcttca actgcggcaa ggagggccac 1200
accgccagga actgccgcgc ccccaggaag aagggctgct ggaagtgcgg caaggagggc 1260
caccagatga aggactgcac cgagaggcag gccaacttcc tgggcaagat ctggcccagc 1320
tacaagggca ggcccggcaa cttcctgcag agcaggcccg agcccaccgc cccccccttc 1380
ctgcagagca ggcccgagcc caccgccccc cccgaggaga gcttcaggag cggcgtggag 1440
accaccaccc ctcctcagaa gcaggagccc atcgacaagg agctgtaccc cctgaccagc 1500
ctgaggagcc tgttcggcaa cgaccccagc agccagtga 1539
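
The sequence listings above follow a fixed row convention: groups of ten bases, with each full row closed by the cumulative base count. As an illustration only, the following Python sketch shows how such rows could be checked against their running counts. The function name `parse_listing` and the format of the reported problems are assumptions made for this example and are not part of the specification.

```python
import re


def parse_listing(text):
    """Join the base groups of a listing and check each row's running count.

    Returns (sequence, problems). Each problem is a (line_number, message)
    tuple for a group with unexpected characters or a row whose trailing
    number disagrees with the number of bases read so far.
    """
    groups = []
    problems = []
    total = 0
    for line_no, line in enumerate(text.strip().splitlines(), start=1):
        tokens = line.split()
        if not tokens:
            continue
        # A trailing all-digit token is taken as the cumulative base count.
        count = int(tokens.pop()) if tokens[-1].isdigit() else None
        for group in tokens:
            if not re.fullmatch(r"[acgt]+", group):
                problems.append((line_no, f"unexpected characters in {group!r}"))
            groups.append(group)
            total += len(group)
        if count is not None and count != total:
            problems.append((line_no, f"row says {count}, counted {total}"))
    return "".join(groups), problems


# First two rows of SEQ ID NO:5 as an example input.
rows = """atgggcgcca gggccagcgt gctgagcggc ggcgagctgg acaggtggga gaagatcagg 60
ctgaggcccg gcggcaagaa gaagtataag ctgaagcaca tcgtgtgggc cagcagggag 120"""
sequence, problems = parse_listing(rows)
print(len(sequence), problems)  # 120 []
```

A row that reports a count different from the bases actually read, or a group containing characters other than a, c, g and t, is returned in `problems` together with its line number, which makes transcription damage in a listing easy to locate.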
Claims



1. Nucleic acid sequence encoding the gag and pol polypeptides, whereby the amino acid Ala is encoded by the
nucleic acid triplet gcc or gct, Arg is encoded by agg or aga, Asn is encoded by aac or aat, Asp is encoded by
gac or gat, Cys is encoded by tgc or tgt, Gln is encoded by cag or caa, Glu is encoded by gag or gaa, Gly is encoded
by ggc or gga, His is encoded by cac or cat, Ile is encoded by atc or att, Leu is encoded by ctg or ctc, Lys is encoded
by aag or aaa, Met is encoded by atg, Phe is encoded by ttc or ttt, Pro is encoded by ccc or cct, Ser is encoded
by agc or tcc, Thr is encoded by acc or aca, Trp is encoded by tgg, Tyr is encoded by tac or tat, Val is encoded by
gtg or gtc and the stop codon is tga or taa.

2. Nucleic acid sequence according to claim 1, whereby the nucleic acid sequence in the region in which the reading
frames encoding the gag and pol polypeptides overlap corresponds to the wildtype nucleic acid sequence encoding
the gag and pol polypeptides.

3. Nucleic acid sequence as depicted in SEQ ID NO:1 or 2.

4. Retroviral gag or gagpol based particle generated by nucleic acid molecules comprising the nucleic acid sequence
according to any one of claims 1 to 3 having at least a 25 fold reduced incorporation rate of said nucleic acid
molecules encoding the gag and pol polypeptides compared to retroviral particles generated by wildtype nucleic
acid sequences encoding the gag and pol polypeptides.



5. Retroviral gag or gagpol based particle produced by nucleic acid molecules comprising the nucleic acid sequence 
according to any one of claims 1 to 3 having at least a 100 fold reduced recombination rate between said nucleic 
acid molecules encoding the gag and pol polypeptides and a wildtype genome based transgene construct derived 
from the same or another retrovirus. 

6. Retroviral packaging cell for the generation of the retroviral particles according to claims 4 or 5 transformed with 
nucleic acid molecules comprising the nucleic acid sequence according to any one of claims 1 to 3. 

7. Nucleic acid molecules comprising the nucleic acid sequence according to any one of claims 1 to 3 for use as an 
active pharmaceutical substance. 

8. Use of nucleic acid molecules comprising the nucleic acid sequence according to any one of claims 1 to 3 for the
preparation of a pharmaceutical composition for the treatment and prophylaxis of diseases caused by retroviruses.
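
For illustration only, the codon restriction of claim 1 can be checked mechanically. The following Python sketch is not part of the claims. The function name `disallowed_codons`, the optional `exempt` range (standing in for the gag/pol overlap region of claim 2, whose coordinates are not stated in the claims) and the reading of the sequence in a single frame from position 1 are assumptions made for this example. The Gln and Lys entries follow the standard genetic code (cag/caa and aag/aaa).

```python
# Codon choices per amino acid as listed in claim 1.
# Assumption: Gln and Lys are entered according to the standard genetic code.
ALLOWED_CODONS = {
    "Ala": {"gcc", "gct"}, "Arg": {"agg", "aga"}, "Asn": {"aac", "aat"},
    "Asp": {"gac", "gat"}, "Cys": {"tgc", "tgt"}, "Gln": {"cag", "caa"},
    "Glu": {"gag", "gaa"}, "Gly": {"ggc", "gga"}, "His": {"cac", "cat"},
    "Ile": {"atc", "att"}, "Leu": {"ctg", "ctc"}, "Lys": {"aag", "aaa"},
    "Met": {"atg"}, "Phe": {"ttc", "ttt"}, "Pro": {"ccc", "cct"},
    "Ser": {"agc", "tcc"}, "Thr": {"acc", "aca"}, "Trp": {"tgg"},
    "Tyr": {"tac", "tat"}, "Val": {"gtg", "gtc"}, "Stop": {"tga", "taa"},
}
PERMITTED = set().union(*ALLOWED_CODONS.values())


def disallowed_codons(seq, exempt=None):
    """List (position, codon) pairs for codons outside the claim 1 table.

    seq is read in one frame starting at nucleotide 1; positions are 1-based.
    exempt is an optional (start, end) nucleotide range, inclusive, whose
    codons are skipped (hypothetical stand-in for the wildtype overlap
    region of claim 2).
    """
    seq = seq.lower()
    hits = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        start = i + 1
        if exempt is not None and exempt[0] <= start <= exempt[1]:
            continue
        codon = seq[i:i + 3]
        if codon not in PERMITTED:
            hits.append((start, codon))
    return hits


# Short made-up example: "cgc" (Arg) is not among the codons of claim 1.
print(disallowed_codons("atgggcgcccgctga"))
print(disallowed_codons("atgggcgcccgctga", exempt=(10, 12)))
```

Because the pol reading frame is shifted relative to gag, a full gagpol sequence would have to be checked in two passes, one per frame; the sketch leaves that split to the caller.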




Fig. 1
Fig. 2A/1

1 ATGGGCGCCAGGGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAGGTGGGA 50
1 ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATCGATGGGA 50

51 GAAGATCAGGCTGAGGCCCGGCGGCAAGAAGAAGTATAAGCTGAAGCACA 100
51 AAAAATTCGGTTAAGGCCAGGGGGAAAGAAAAAATATAAATTAAAACATA 100

101 TCGTGTGGGCCAGCAGGGAGCTGGAGAGGTTCGCCGTGAACCCCGGCCTG 150
101 TAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAATCCTGGCCTG 150

151 CTGGAGACCAGCGAGGGCTGCAGGCAGATCCTGGGCCAGCTGCAGCCCAG 200
151 TTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCATC 200

201 CCTGCAGACCGGCAGCGAGGAGCTGAGGAGCCTGTACAACACCGTGGCCA 250
201 CCTTCAGACAGGATCAGAAGAACTTAGATCATTATATAATACAGTAGCAA 250

251 CCCTGTACTGCGTGCACCAGAGGATCGAGATCAAGGACACCAAGGAGGCC 300
251 CCCTCTATTGTGTGCATCAAAGGATAGAGATAAAAGACACCAAGGAAGCT 300

301 CTGGACAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAGCA 350
301 TTAGACAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAAGCACAGCA 350

351 GGCCGCCGCCGACACCGGCCACAGCAGCCAGGTGAGCCAGAACTACCCCA 400
351 AGCAGCAGCTGACACAGGACACAGCAGTCAGGTCAGCCAAAATTACCCTA 400

401 TCGTGCAGAACATCCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCAGG 450
401 TAGTGCAGAACATCCAGGGGCAAATGGTACATCAGGCCATATCACCTAGA 450

451 ACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGA 500
451 ACTTTAAATGCATGGGTAAAAGTAGTAGAAGAGAAGGCTTTCAGCCCAGA 500

501 GGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGAGCCACCCCCCAGGACC 550
501 AGTAATACCCATGTTTTCAGCATTATCAGAAGGAGCCACCCCACAAGATT 550

551 TGAACACCATGCTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATG 600
551 TAAACACCATGCTAAACACAGTGGGGGGACATCAAGCAGCCATGCAAATG 600

601 CTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACAGGGTGCACCC 650
601 TTAAAAGAGACCATCAATGAGGAAGCTGCAGAATGGGATAGAGTACATCC 650

651 CGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGAGGGAGCCCCGCGGCA 700
651 AGTGCATGCAGGGCCTATTGCACCAGGCCAGATGAGAGAACCAAGGGGAA 700

701 GCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATG 750
701 GTGACATAGCAGGAACTACTAGTACCCTTCAGGAACAAATAGGATGGATG 750

751 ACCAACAACCCCCCCATCCCCGTGGGCGAAATCTACAAGAGGTGGATCAT 800
751 ACAAATAATCCACCTATCCCAGTAGGAGAAATTTATAAAAGATGGATAAT 800
Fig. 2A/2

801 CCTGGGCCTGAACAAGATCGTGAGGATGTACAGCCCCACCAGCATCCTGG 850
801 CCTGGGATTAAATAAAATAGTAAGAATGTATAGCCCTACCAGCATTCTGG 850

851 ATATCAGGCAGGGCCCCAAAGAGCCCTTCAGGGACTACGTGGACAGGTTC 900
851 ACATAAGACAAGGACCAAAAGAACCTTTTAGAGACTATGTAGACCGGTTC 900

901 TACAAGACCCTGCGCGCCGAGCAGGCCAGCCAGGAGGTGAAGAACTGGAT 950
901 TATAAAACTCTAAGAGCCGAGCAAGCTTCACAGGAGGTAAAAAATTGGAT 950

951 GACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCC 1000
951 GACAGAAACCTTGTTGGTCCAAAATGCGAACCCAGATTGTAAGACTATTT 1000

1001 TGAAGGCCCTGGGACCCGCCGCCACCCTGGAGGAGATGATGACCGCCTGC 1050
1001 TAAAAGCATTGGGACCAGCGGCTACACTAGAAGAAATGATGACAGCATGT 1050

1051 CAGGGCGTGGGCGGCCCCGGCCACAAGGCCAGGGTGCTGGCCGAGGCCAT 1100
1051 CAGGGAGTAGGAGGACCCGGCCATAAGGCAAGAGTTTTGGCTGAAGCAAT 1100

1101 GAGCCAGGTGACCAACACCGCCACCATCATGATGCAGAGGGGCAACTTCA 1150
1101 GAGCCAAGTAACAAATACAGCTACCATAATGATGCAGAGAGGCAATTTTA 1150

1151 GGAACCAGAGGAAGATGGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCAC 1200
1151 GGAACCAAAGAAAGATGGTTAAGTGTTTCAATTGTGGCAAAGAAGGGCAC 1200

1201 ACCGCCAGGAACTGCCGCGCCCCCAGGAAGAAGGGCTGCTGGAAGTGCGG 1250
1201 ACAGCCAGAAATTGCAGGGCCCCTAGGAAAAAGGGCTGTTGGAAATGTGG 1250

1251 CAAGGAGGGCCACCAGATGAAGGACTGCACCGAGAGGCAGGCCAACTTCC 1300
1251 AAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCTAATTTTT 1300

1301 TGGGCAAGATCTGGCCCAGCTACAAGGGCAGGCCCGGCAACTTCCTGCAG 1350
1301 TAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAG 1350

1351 AGCAGGCCCGAGCCCACCGCCCCCCCCTTCCTGCAGAGCAGGCCCGAGCC 1400
1351 AGCAGACCAGAGCCAACAGCCCCACCATTTCTTCAGAGCAGACCAGAGCC 1400

1401 CACCGCCCCCCCCGAGGAGAGCTTCAGGAGCGGCGTGGAGACCACCACCC 1450
1401 AACAGCCCCACCAGAAGAGAGCTTCAGGTCTGGGGTAGAGACAACAACTC 1450

1451 CCCCCCAGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGACCAGC 1500
1451 CCCCTCAGAAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCC 1500

1501 CTGAGGAGCCTGTTCGGCAACGACCCCAGCAGCCAGTGA 1539 SYNTHETIC
1501 CTCAGATCACTCTTTGGCAACGACCCCTCGTCACAATAA 1539 WILDTYPE
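
In the alignment of Fig. 2A the upper row of each pair is the synthetic sequence and the lower row the wildtype sequence, with identical positions marked between the rows. As an illustration only, the following Python sketch shows one way such a match line and the resulting percent identity could be computed for two aligned rows of equal length. The function names `match_line` and `percent_identity` are assumptions made for this example and do not come from the specification.

```python
def match_line(upper, lower):
    """Return '|' for every aligned position where both rows agree, else ' '."""
    if len(upper) != len(lower):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b else " " for a, b in zip(upper.upper(), lower.upper())
    )


def percent_identity(upper, lower):
    """Percentage of aligned positions that are identical in both rows."""
    marks = match_line(upper, lower)
    return 100.0 * marks.count("|") / len(marks)


# First block of Fig. 2A (positions 1 to 50).
synthetic = "ATGGGCGCCAGGGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAGGTGGGA"
wildtype = "ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATCGATGGGA"
print(synthetic)
print(match_line(synthetic, wildtype))
print(wildtype)
print(round(percent_identity(synthetic, wildtype), 1))
```

Applied block by block, this reproduces the match lines of the figure; applied to the concatenated rows, it gives the overall nucleotide identity between the synthetic and the wildtype gag reading frame.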
Fig. 2B
Fig. 3A
Fig. 5A
Fig. 6A